(57) Abstract: The present invention provides novel isolated NOVX polynucleotides and polypeptides encoded by the NOVX 
polynucleotides. Also provided are the antibodies that immunospecifically bind to a NOVX polypeptide or any derivative, variant, 

Novel Polypeptides and Nucleic Acids Encoding Same 



Background of the Invention 

The invention relates generally to nucleic acids and polypeptides. 


Summary of the Invention 

The present invention is based, in part, upon the discovery of novel human nucleic acid 
sequences encoding polypeptides. The NOV-X nucleic acids, polynucleotides, proteins, and 
polypeptides or fragments thereof described herein collectively include NOV-1, NOV-2a, and 
NOV-2b, which are novel KIAA1233-like polypeptides; NOV-3a, NOV-3b, NOV-3c, and 
NOV-3d, which are novel STE20-like polypeptides; and NOV-4a, NOV-4b, NOV-4c, NOV-4d, 
and NOV-4e, which are novel trypsin inhibitor-like polypeptides. 

In one aspect, the invention includes an isolated NOV-X nucleic acid molecule which 
includes a nucleotide sequence encoding a polypeptide that includes the amino acid sequence 
of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. For example, in various 
embodiments, the nucleic acid can include a nucleotide sequence that includes SEQ ID NO: 1, 
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57. Alternatively, the encoded NOV-X polypeptide may 
have a variant amino acid sequence, e.g., have an identity or similarity less than 100% to the 
disclosed amino acid sequences, as described herein. 

The invention also includes an isolated polypeptide that includes the amino acid 
sequence of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, or a fragment having at 
least 6 amino acids of these amino acid sequences. Also included is a naturally occurring 
polypeptide variant of a NOV-X polypeptide, wherein the polypeptide is encoded by a nucleic 
acid molecule which hybridizes under stringent conditions to a nucleic acid molecule 
consisting of a NOV-X nucleic acid molecule. 

Also included in the invention is an antibody that selectively binds to a NOV-X 
polypeptide. The antibody is preferably a monoclonal antibody, and most preferably is a 
human antibody. Such antibodies are useful, for example, in the treatment of a pathological 
state in a subject wherein the treatment includes administering the antibody to the subject. 



The invention further includes a method for producing a NOV-X polypeptide by 
culturing a host cell expressing one of the herein described NOV-X nucleic acids under 
conditions in which the nucleic acid molecule is expressed. 

The invention also includes methods for detecting the presence of a NOV-X 
polypeptide or nucleic acid in a sample from a mammal, e.g., a human, by contacting a sample 
from the mammal with an antibody which selectively binds to one of the herein described 
polypeptides, and detecting the formation of reaction complexes including the antibody and 
the polypeptide in the sample. Detecting the formation of complexes in the sample indicates 
the presence of the polypeptide in the sample. 

The invention further includes a method for detecting or diagnosing the presence of a 
disease, e.g., a pathological condition, associated with altered levels of a polypeptide having 
an amino acid sequence at least 80% identical to a NOV-X polypeptide in a sample. The 
method includes measuring the level of the polypeptide in a biological sample from the 
mammalian subject, e.g., a human, and comparing the level detected to a level of the 
polypeptide present in normal subjects, or in the same subject at a different time, e.g., prior to 
onset of a condition. An increase or decrease in the level of the polypeptide as compared to 
normal levels indicates a disease condition. 

Also included in the invention is a method of detecting the presence of a NOV-X 
nucleic acid molecule in a sample from a mammal, e.g., a human. The method includes 
contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the 
nucleic acid molecule and determining whether the nucleic acid probe or primer binds to a 
nucleic acid molecule in the sample. Binding of the nucleic acid probe or primer indicates the 
nucleic acid molecule is present in the sample. 
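
As a purely illustrative, in-silico analogue of the probe-binding step described above, the following sketch treats "hybridization" as exact reverse-complement matching between a probe and one strand of a sample. This is a simplifying assumption: actual selective hybridization depends on stringency conditions and tolerates some mismatches. The function names and the toy sequences are hypothetical and are not taken from the specification.

```python
# Hypothetical in-silico stand-in for probe hybridization: exact matching only.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_binds(probe: str, sample: str) -> bool:
    """True if the probe is perfectly complementary to a stretch of the sample strand."""
    target = reverse_complement(probe)
    return target in sample.upper()


# Toy data (assumed for illustration, not from the disclosed sequences).
sample = "TTGACCATGGCTAGCTAAGG"
probe = reverse_complement("ATGGCTAGC")  # probe designed against an internal stretch

print(probe_binds(probe, sample))        # probe finds its complementary target
print(probe_binds("AAAAAAAAA", sample))  # unrelated probe does not
```

A positive result here corresponds to the statement that binding of the probe indicates the nucleic acid molecule is present in the sample.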

The invention further includes a method for detecting or diagnosing the presence of a 
disease associated with altered levels of a NOV-X nucleic acid in a sample from a mammal, 
e.g., a human. The method includes measuring the level of the nucleic acid in a biological 
sample from the mammalian subject and comparing the level detected to a level of the nucleic 
acid present in normal subjects, or in the same subject at a different time. An increase or 
decrease in the level of the nucleic acid as compared to normal levels indicates a disease 
condition. 

The invention also includes a method of treating a pathological state in a mammal, e.g., 
a human, by administering to the subject a NOV-X polypeptide in an amount 
sufficient to alleviate the pathological condition. The polypeptide has an amino acid sequence 
at least 80% identical to a NOV-X polypeptide. 

Alternatively, the mammal may be treated by administering an antibody as herein 
described in an amount sufficient to alleviate the pathological condition. 

Pathological states for which the methods of treatment of the invention are envisioned 
include hematopoietic, immunological, tumor, cancer, neurodegenerative (e.g. Alzheimer's 
and Parkinson's disease) and fertility disorders. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, suitable methods and materials are 
described below. All publications, patent applications, patents, and other references 
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the 
present specification, including definitions, will control. In addition, the materials, methods, 
and examples are illustrative only and not intended to be limiting. 

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

Detailed Description of the Invention 

The present invention is based, in part, upon the discovery of novel human nucleic acid 
sequences and of polypeptides encoded by these nucleic acids. The nucleic acids have been 
named "NOV 1-4", or collectively, "NOV-X". Representative NOV-X sequences, and 
representative examples of uses of these sequences, are briefly discussed below. 

Table 1 provides a summary of the NOV-X nucleic acids, their encoded polypeptides 
and homology. 



TABLE 1. Sequences and Corresponding SEQ ID Numbers 

NOVX        Internal          SEQ ID NO        SEQ ID NO       Homology
Assignment  Identification    (nucleic acid)   (polypeptide)

1           10132038.0.67     1                2               KIAA1233 protein
2a          10132038.0.139    3                4               KIAA1233 protein
2b          10132038.0.136    57               5               KIAA1233 protein
3a          18552586_EXT1     6                7               STE20 protein kinase
3b          18552586_EXT2     8                9               STE20 protein kinase
3c          18552586_EXT3     10               11              STE20 protein kinase
3d          18552586_EXT4     12               13              STE20 protein kinase
4a          10093872.0.107    14               15              Trypsin inhibitor
4b          10093872.1        16               17              Trypsin inhibitor
4c          10093872.0.38     18               19              Trypsin inhibitor
4d          10093872.2        20               21              Trypsin inhibitor
4e          10093872.3        22               23              Trypsin inhibitor

WO 01/62928    PCT/US01/06151 

NOV-1: A Novel KIAA1233-like Polypeptide 

A NOV-1 sequence according to the invention is a nucleotide sequence encoding a 
polypeptide related to KIAA1233 proteins, which bear sequence similarity to lacunin, 
thrombospondins, proteinases, semaphorins, ADAM-TS, and properdin family members. This 
invention maps to Unigene cluster Hs.18705. This cluster has been mapped to Chromosome 
15 Marker stSG35204, Interval D15S115-D15S152. By integrating information from the 
Online Mendelian Inheritance in Man (OMIM), this region is identified as 15q22-qter. 
Therefore, the chromosomal location of the invention is Chromosome 15 Marker 
stSG35204, Interval D15S115-D15S152, which corresponds to 15q22-qter. 

The nucleic acid of the invention, NOV-1, encoding a KIAA1233-like protein 
originating from chromosome 15, is shown in TABLE 2. The disclosed nucleic acid (SEQ ID 
NO: 1) is a full-length clone of 1281 nucleotides and contains an open reading frame (ORF) 
that begins with an ATG initiation codon at nucleotide 416 and ends with a TAA stop codon at 
nucleotide 4259. A representative ORF encodes a 1281 amino acid polypeptide (SEQ ID 
NO: 2). The initiation and stop codons of SEQ ID NO: 1 are shown in bold font. Putative 
untranslated regions are upstream of the initiation codon and downstream of the stop codon in 
SEQ ID NO: 1. 
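
The relationship between the stated ORF coordinates and the length of the encoded polypeptide can be checked with a short sketch. The code below walks an open reading frame from a given ATG to the first in-frame stop codon using the standard genetic code. The function name and the toy sequence are hypothetical; the only figures taken from the text are the start position (416) and the stop codon position (4259), and the sketch assumes that 4259 is the first nucleotide of the TAA codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_orf(seq: str, start: int):
    """Translate from a 1-based ATG position to the first in-frame stop codon.

    Returns (protein, stop_position), where stop_position is the 1-based
    position of the first nucleotide of the stop codon.
    """
    seq = seq.upper()
    i = start - 1
    if seq[i:i + 3] != "ATG":
        raise ValueError("no ATG at the given start position")
    protein = []
    while i + 3 <= len(seq):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" marks an unreadable codon
        if aa == "*":
            return "".join(protein), i + 1
        protein.append(aa)
        i += 3
    raise ValueError("no in-frame stop codon found")


# Toy sequence (assumed for illustration): ATG CCC TAT GAC TAA starting at position 3.
protein, stop_pos = translate_orf("GGATGCCCTATGACTAAGG", 3)
print(protein, stop_pos)

# Coordinates from the text: ATG at 416, stop codon at 4259.
orf_codons = (4259 - 416) // 3
print(orf_codons)
```

Under the stated assumption, the span from nucleotide 416 up to the stop codon is a whole number of codons and corresponds to the 1281 amino acid polypeptide described above.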


TABLE 2 

TAATAGAGACCTTTCAAAGGACAAATTCTGTGAAATAAAGTGGTTTTCTGTVAGAGCCTAC 
TAATAGGACAGTGTGTTAATATCACTAATAAGAGAGTAATGATTATAAAAAGGAATAAAT 
TTATTGAAATTGC7VAGATACTTTTCTCCTTTGATTAATATACTGCTAGTTTAGTTTTCTA 

CATTTTCAAATAGAACTGGGGAATTTGTGTCGTAGATATTCTTGACAACTAAAGAGATGG 
TGGCTGAATTTTTGGGAATGGTTGATAACACTTGATATTTTTAGTTTCCAATTTGGAAGA 
GCTCTGTCTCTTGGGATGTCAAATATTATATTCGTCT^TTAATGAATGTGTTAATTTATT 
ATAGAAATGATATTCTCACAATGATTTCATTTGTAGTGATGGATTTAAAGAGATAATGCC 
CTATGACCACTTCC7\ACCTCTTCCTCGCTGGGAACATAATCCTTGGACTGCATGTTCCGT 

GTCCTGTGGAGGAGGGATTCAGAGACGGAGCTTTGTGTGTGTAGAGGAATCCATGCATGG 
AGAGATATTGCAGGTGGAAGAATGGAAGTGCATGTACGCACCCAAACCCAAGGTTATGCA 

AACTTGTAATCTGTTTGATTGCCCCAAGTGGATTGCCATGGAGTGGTCTCAGTGCACAGT 
GACTTGTGGCCGAGGGTTACGGTACCGGGTTGTTCTGTGTATTAACCACCGCGGAGAGCA 
TGTTGGGGGCTGCAATCCACAACTGAAGTTACACATCAAAGT^GAATGTGTCATTCCCAT 
CCCGTGTTATAAACCAAAAGAAAAT^GTCCAGTGGAAGCAAAATTGCCTTGGCTGAAACA 

agcacaagaactagaagagaccagaatagcaacagaagaaccaacgttcattccagaacc 
ctggtcagcctgcagtaccacgtgtgggccgggtgtgcaggtccgtgaggtgaagtgccg 
tgtgctcctcacattcacgcagactgagactgagctgcccgaggaagagtgtgaaggccc 
caagctgcccaccgaacggccctgcctcctggaagcatgtgatgagagcccggcctcccg 
agagctagacatccctctccctgaggacagtgagacgacttacgactgggagtacgctgg 

GTTCACCCCTTGCACAGCTU^CATGCGTGGGAGGCCATCT^GAAGCCATAGCAGTGTGCTT 
ACATATCCAGACCCAGCAGACAGTCAATGACAGCTTGTGTGATATGGTCCACCGTCCTCC 

agccatgagccaggcctgtaacacagagccctgtccccccaggtggcatgtgggctcttg 
ggggccctgctcagctacctgtggagttggaattcagacccgagatgtgtactgcctgca 
cccaggggagacccctgcccctcctgaggagtgccgagatgaaaagccccatgctttaca 
agcatgcaatcagtttgactgccctcctggctggcacattgaagaatggcagcagtgttc 
caggacttgtggcgggggaactcagaacagaagagtcacctgtcggcagctgctaacgga 
tggcagctttttgaatctctcagatgaattgtgccaaggacccaaggcatcgtctcacaa 
gtcctgtgccaggacagactgtcctccacatttagctgtgggagactggtcgaagtgttc 
tgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaaggctggcagccaa 
aggtcggcgcatccccctcagtgagatgatgtgcagggatctaccagggctccctcttgt 
aagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatgaagacaaaacttgg 
tgagcagggtccgcagatcctcagtgtccagagagtctacattcagacaagggaagagaa 

GCGTATTAACCTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATCCGTGATTAT 

taagtgcccagtgcgacgattccagaaatctctgatccagtgggagaaggatggccgttg 
cctgcagaactccaaacggcttggcatcaccaagtcaggctcactaaaaatccacggtct 
tgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcacaggaaacagttgt 
gctcaagctcattggtactgacaaccggctcatcgcacgcccagccctcagggagcctat 
gagggaatatcctgggatggaccacagcgaagccaatagtttgggagtcacatggcacaa 
aatgaggcaaatgtggaataacaaaaatgacctttatctggatgatgaccacattagtaa 
ccagcctttcttgagagctctgttaggccactgcagcaattctgcaggaagcaccaactc 
ctgggagttgaagaataagcagtttgaagcagcagttaaacaaggagcatatagcatgga 
tacagcccagtttgatgagctgataagaaacatgagtcagctcatggaaaccggagaggt 
cagcgatgatcttgcgtcccagctgatatatcagctggtggccgaattagccaaggcaca 
gccaacacacatgcagtggcggggcatccaggaagagacacctcctgctgctcagctcag 
aggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcaggcaagctgacatt 
caagccgaaaggacctgttctcatgaggcaaagccaacctccctcaatttcatttaataa 
aacaataaattccaggattggaaatacagtatacattacaaaaaggacagaggtcatcaa 
tatactgtgtgaccttattacccccagtgaggccacatatacatggaccaaggatggaac 
cttgttacagccctcagtaaaaataattttggatggaactgggaagatacagatacagaa 
tcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatcatcttggttcaga 

TGTGGAAAGTTCTTCTGTGCTGTATGCAGAGGCACCTGTCATCTTGTCTGTTGAAAGAAA 

tatcaccaaaccagagcacaaccatctgtctgttgtggttggaggcatcgtggaggcagc 
CCTTGGAGCAAACGTGACAATCCGATGTCCTGTAAAAGGTGTCCCTCAGCCTAATATAAC 
TTGGTTGAAGAGAGGAGGATCTCTGAGTGGCAATGTTTCCTTGCTTTTCAATGGATCCCT 
GTTGTTGCAGAATGTTTCCCTTGAAAATGAAGGAACCTACGTCTGCATAGCCACCAATGC 
TCTTGGAAAGGCAGTGGCAACATCTGTACTCCACTTGCTGGAACGAAGATGGCCAGAGAG 
TAGAATCGTATTTCTGCAAGGACATAAAAAGTACATTCTCCAGGCAACCAACACTAGAAC 
CAACAGCAATGACCCAACAGGAGAACCCCCGCCTCAAGAGCCTTTTTGGGAGCCTGGTAA 
CTGGTCACATTGTTCTGCCACCTGTGGTCATTTGGGAGCCCGCATTCAGAGACCCCAGTG 
TGTGATGGCCAATGGGCAGGAAGTGAGTGAGGCCCTGTGTGATCACCTCCAGAAGCCACT 
GGCTGGGTTTGAGCCCTGTAACATCCGGGACTGCCCAGCGAGGTGGTTCACAAGTGTGTG 

GTCACAGTGCTCTGTGTCTTGCGGTGAAGGATACCACAGTCGGCAGGTGACGTGCAAGCG 
GACAAAAGCCAATGGAACTGTGCAGGTGGTGTCTCCAAGAGCATGTGCCCCTAAAGACCG 
GCCTCTGGGAAGAAAACCATGTTTTGGTCATCCATGTGTTCAGTGGGAACCAGGGAACCG 
GTGTCCTGGACGTTGCATGGGCCGTGCTGTGAGGATGCAGCAGCGTCACACAGCTTGTCA 
ACACAACAGCTCTGACTCCAACTGTGATGACAGAAAGAGACCCACCTTAAGAAGGAACTG 

CACATCAGGGGCCTGTGATGTGTGTTGGCACACAGGCCCTTGGAAGCCCTGTACAGCAGC 
CTGTGGCAGGGGTTTCCAGTCTCGGAAAGTCGACTGTATCCACACAAGGAGTTGCAAACC 
TGTGGCCAAGAGACACTGTGTACAGAAAAAGAAACCAATTTCCTGGCGGCACTGTCTTGG 
GCCCTCCTGTGATAGAGACTGCACAGACACAACTCACTACTGTATGTTTGTAAAACATCT 
TAATTTGTGTTCTCTAGACCGCTACAAACAAAGGTGCTGCCAGTCATGTCAAGAGGGATA 

AACCTTTGGAGGGGTCATGATGCTGCTGTGAAGATAAAAGTAGAATATAAAAGCTCTTTT 
CCCCATGTCGCTGATTCAAAAACATGTATTTCTTAAAAGACTAGATTCTATGGATCAAAC 
AGAGGTTGATGCAAAAACACCACTGTTAAGGTGTAAAGTGAAATTTTCCAATGGTAGTTT 
TATATTCCAATTTTTTAAAATGATGTATTCAAGGATGAACAAAATACTATAGCATGCATG 
CCACTGCACTTGGGACCTCATCATGTCAGTTGAATCGAGAAATCACCAAGATTATGAGTG 

CATCCTCACGTGCTGCCTCTTTCCTGTGATATGTAGACTAGCACAGAGTGGTACATCCTA 
AAAACTTGGGAAACACAGCAACCCATGACTTCCTCTTCTCTCAAGTTGCAGGTTTTCAAC 
AGTTTTATAAGGTATTTGCATTTTAGAAGCTCTGGCCAGTAGTTGTTAAGATGTTGGCAT 
TAATGGCATTTTCATAGATCCTTGGTTTAGTCTGTGAAAAAGAAACCATCTCTCTGGATA 
GGCTGTCACACTGACTGACCTAAGGGTTCATGGAAGCATGGCATCTTGTCCTTGCTTTTA 

GAACACCCATGGAAGAAAACACAGAGTAGATATTGCTGTCATTTATACAACTACAGAAAT 
TTATCTATGACCTAATGAGGCATCTCGGAAGTCAAAGAAGAGGGAAAGTTAACCTTTTCT 
ACTGATTTCGTAGTATATTCAGAGCTTTCTTTTAAGAGCTGTGAATGAAACTTTTTCTAA 
GCACTATTCTATTGCACACAT^CAGAAAACCA/IAGCCTTATTAGACCTAATTTATGCATA 
AAGTAGTATTCCTGAGAACTTTATTTTGGAAAATTTATAAGAAAGTAATCCAAATAAGAA 

ACACGATAGTTGAAAATAATTTTTATAGTAAATAATTGTTTTGGGCTGATTTTTCAGTAA 
ATCCAAAGTGACTTAGGTTAGAAGTTACACTAAGGACCAGGGGTTGGAATCAGAATTTAG 
TTTAAGATTTGAGGAAAAGGGT7VAGGGTTAGTTTCAGTTTTAGGATTAGAGCTAGAATTG 
GGTTAGGTGAGAAAGAAAGTTAAGGTTAAGGCTAGAGTTGTCTTTAAGGGTTAGGGTTAG 
GACCAGGTTAGGTCAGGGTTGGATTGGGTTTAGATTGGGGCCAGTGCTGGTGTTAGTGAT 

AGTGTCAGGATGGAGGTTAGGTTTGGAGTAAGCGTTGTTGCTGAAGTGAGTTCAGGCTAG 
CATTAAATTGTAAGTTCTGAAGCTGATTTGGTTATGGGGTCTTTCCCCTGTATACTACCA 
GTTGTGTCTTTAGATGGCACACAAGTCCAAATAAGTGGTCATACTTCTTTATTCAGGGTC 

TCAGCTGCCTGTACACCTGCTGCCTACATCTTCTTGGCAACAAftGTTACCTGCCACAGGC 
TCTGCTGAGCCTAGTTCCTGGTCAGTAATAACTGAACAGTGCATTTTGGCTTTGGATGTG 
TCTGTGGACAAGCTTGCTGAGTTTCTCTACCATATTCTGAGCACACGGTCTCTTTTGTTC 
TAATTTCAGCTTCACTGACACTGGGTTGAGCACTACTGTATGTGGAGGGTTTGGTGATTG 
GGAATGGATGGGGGACAGTGAGGAGGACACACCAGCCCATTAGTTGTTAATCATCAATCA 
CATCTGATTGTTGAAGGTTATTAAATTAAAAGAAAGATCATTTGTAACATACTCTTTGTA 
TATATTTATTATATGAAAGGTGCAATATTTTATTTTGTACAGTATGTAATAAAGACATGG 
GACATATATTTTTCTTATTAACAAAATTTCATATTAAATTGCTTCACTTTGTATTTAAAG 
TTAAAAGTTACTATTTTTCATTTGCTATTGTACTTTCATTGTTGTCATTCAATTGACATT 
CCTGTGTACTGTATTTTACTACTGTTTTTATAACATGAGAGTTAATGTTTCTGTTTCATG 
ATCCTTATGTAATTCAGAAATAAATTTACTTTGATTATTCAGTGGCATCCTTAT (SEQ ID NO: 1) 



MPYDHFQPLPRWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCPKWIAME 
WSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEKSPVEAKLPWLKQAQELEETRIA 
TEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQTETELPEEECEGPKLPTERPCLLEACDESPASRELDIPL 
PEDSETTYDWEYAGFTPCTATCVGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGP 
CSATCGVGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQLL 
TDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQRLAAKGRRIPLSEMMCRDL 
PGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKS 
LIQWEKDGRCLQNSKRLGITKSGSLKIHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGM 
DHSEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGAYSMDTA 
QFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPAAQLRGETGSVSQSSHAKNSGKL 
TFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTG 
KIQIQNPTRKEQGIYECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCP 
VKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERRWPESRIVFLQ 
GHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQRPQCVMANGQEVSEALCDHLQKPLAG 
FEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQVTCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPG 
NRCPGRCMGRAVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIH 
TRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSCQEG (SEQ ID NO: 2) 

In a search of sequence databases, it was found, for example, that the disclosed NOV-1 
nucleotide sequence has 5106 of 5107 bases (99%) identical to a human mRNA for a 
KIAA1233 protein (SECR) (GenBank Accession No: AB033059), as shown in Table 3. In all 
sequence alignments, identical residues are depicted as "|". As indicated by the "Expect" 
value, the probability of this alignment occurring by chance alone is 0.0, the lowest 
probability. 
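
The identity figures quoted above can be illustrated with a minimal sketch that counts matching positions between two aligned rows. The two rows used here are the Table 3 row beginning at NOV1 position 1968 and its SECR counterpart, as transcribed from the alignment; the function names are hypothetical. The sketch also assumes that the reported percentage is rounded down, which is consistent with 5106 of 5107 bases being stated as 99%.

```python
def count_identities(row_a: str, row_b: str):
    """Return (matches, aligned_length) for two equal-length aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return matches, len(row_a)


def percent_floor(matches: int, length: int) -> int:
    """Percent identity rounded down to a whole number (an assumed convention)."""
    return (100 * matches) // length


# Rows transcribed from Table 3 (NOV1 1968-2027 versus SECR 781-840).
nov1_row = "catccgtgattattaagtgcccagtgcgacgattccagaaatctctgatccagtgggaga"
secr_row = "catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga"

print(count_identities(nov1_row, secr_row))
print(percent_floor(5106, 5107))
```

In this transcription the two rows differ at a single position, which would account for the one non-identical base in the 5106 of 5107 identities.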

Furthermore, the encoded amino acid sequence has 1023 of 1023 amino acid residues 
(100%) identical to, and 1023 of 1023 residues (100%) positive with, a 1023 amino acid 
residue human KIAA1233 protein (GenBank Accession No: BAA86547), as shown in Table 
4. As indicated by the "Expect" value, the probability of this alignment occurring by chance 
alone is 0, the lowest probability. 

TABLE 3 

Score = 1.012e+04 bits (5103), Expect = 0.0 
Identities = 5106/5107 (99%) 
Strand = Plus / Plus 

NOV1: 1188 tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 1247
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1    tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 60

NOV1: 1248 tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 1307
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 61   tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 120

NOV1: 1308 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 1367
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 121  atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 180

NOV1: 1368 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 1427
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 181  tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 240

NOV1: 1428 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 1487
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 241  cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 300

NOV1: 1488 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 1547
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 301  ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 360

NOV1: 1548 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 1607
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 361  agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 420

NOV1: 1608 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 1667
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 421  catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 480

NOV1: 1668 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 1727
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 481  ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 540

NOV1: 1728 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 1787
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 541  ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 600

NOV1: 1788 ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 1847
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 601  ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 660

NOV1: 1848 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 1907
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 661  agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 720

NOV1: 1908 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 1967
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 721  caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 780

NOV1: 1968 catccgtgattattaagtgcccagtgcgacgattccagaaatctctgatccagtgggaga 2027
           |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
SECR: 781  catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 840

NOV1: 2028 aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 2087
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 841  aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 900

NOV1: 2088 aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 2147
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 901  aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 960

NOV1: 2148 aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 2207
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 961  aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 1020

NOV1: 2208 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 2267
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1021 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 1080

NOV1: 2268 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 2327
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1081 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 1140

NOV1: 2328 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 2387
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1141 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 1200

NOV1: 2388 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 2447
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1201 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 1260

NOV1: 2448 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 2507
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1261 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 1320

NOV1: 2508 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 2567
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1321 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 1380

NOV1: 2568 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 2627
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1381 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 1440

NOV1: 2628 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 2687
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1441 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 1500

NOV1: 2688 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 2747
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1501 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 1560

NOV1: 2748 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 2807
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1561 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 1620

NOV1: 2808 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 2867
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1621 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 1680

NOV1: 2868 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 2927
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1681 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 1740

NOV1: 2928 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 2987
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1741 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 1800

NOV1: 2988 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 3047
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1801 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 1860

NOV1: 3048 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 3107
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1861 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 1920

NOV1: 3108 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 3167
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1921 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 1980

NOV1: 3168 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 3227
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 1981 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 2040

NOV1: 3228 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 3287
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2041 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 2100

NOV1: 3288 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 3347
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2101 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 2160

NOV1: 3348 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 3407
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2161 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 2220

NOV1: 3408 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 3467
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2221 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 2280

NOV1: 3468 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 3527
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2281 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 2340

NOV1: 3528 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 3587
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2341 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 2400

NOV1: 3588 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 3647
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2401 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 2460

NOV1: 3648 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 3707
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2461 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 2520

NOV1: 3708 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 3767
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2521 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 2580

NOV1: 3768 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 3827
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2581 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 2640

NOV1: 3828 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 3887
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2641 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 2700

NOV1: 3888 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 3947
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2701 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 2760

NOV1: 3948 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 4007
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2761 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 2820

NOV1: 4008 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 4067
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2821 cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 2880

NOV1: 4068 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 4127
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2881 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 2940

NOV1: 4128 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 4187
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 2941 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 3000

NOV1: 4188 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 4247
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3001 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 3060

NOV1: 4248 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 4307
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3061 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 3120

NOV1: 4308 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 4367
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR: 3121 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 
3180 



NOVl : 4368 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 
4427 

M I I I I M I I I I I I M I I I I i I I ) I I I I I I I M I I I M I I I M I I I I I I M I I I I I M I I 
SECR : 3181 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 
3240 



13 



wo 01/62928 



PCT/USOl/06151 



NOV1 : 4428 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 4487
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3241 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 3300

NOV1 : 4488 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 4547
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3301 tatagcatgcatgccactgcacttgggacctcatcatgtcagttgaatcgagaaatcacc 3360

NOV1 : 4548 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 4607
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3361 aagattatgagtgcatcctcacgtgctgcctctttcctgtgatatgtagactagcacaga 3420

NOV1 : 4608 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 4667
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3421 gtggtacatcctaaaaacttgggaaacacagcaacccatgacttcctcttctctcaagtt 3480

NOV1 : 4668 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 4727
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3481 gcaggttttcaacagttttataaggtatttgcattttagaagctctggccagtagttgtt 3540

NOV1 : 4728 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 4787
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3541 aagatgttggcattaatggcattttcatagatccttggtttagtctgtgaaaaagaaacc 3600

NOV1 : 4788 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 4847
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3601 atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 3660

NOV1 : 4848 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 4907
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3661 gtccttgcttttagaacacccatggaagaaaacacagagtagatattgctgtcatttata 3720

NOV1 : 4908 caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 4967
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3721 caactacagaaatttatctatgacctaatgaggcatctcggaagtcaaagaagagggaaa 3780



NOV1 : 4968 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 5027
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3781 gttaaccttttctactgatttcgtagtatattcagagctttcttttaagagctgtgaatg 3840

NOV1 : 5028 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 5087
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3841 aaactttttctaagcactattctattgcacacaaacagaaaaccaaagccttattagacc 3900

NOV1 : 5088 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 5147
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3901 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 3960

NOV1 : 5148 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 5207
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3961 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 4020

NOV1 : 5208 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 5267
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4021 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 4080

NOV1 : 5268 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 5327
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4081 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 4140

NOV1 : 5328 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 5387
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4141 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 4200

NOV1 : 5388 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 5447
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4201 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 4260

NOV1 : 5448 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 5507
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4261 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 4320



NOV1 : 5508 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 5567
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4321 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 4380

NOV1 : 5568 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 5627
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4381 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 4440

NOV1 : 5628 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 5687
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4441 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 4500

NOV1 : 5688 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 5747
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4501 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 4560

NOV1 : 5748 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 5807
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4561 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 4620

NOV1 : 5808 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 5867
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4621 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 4680

NOV1 : 5868 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 5927
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4681 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 4740

NOV1 : 5928 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 5987
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4741 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 4800

NOV1 : 5988 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 6047
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4801 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 4860

NOV1 : 6048 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 6107
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4861 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 4920

NOV1 : 6108 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 6167
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4921 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 4980

NOV1 : 6168 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 6227
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4981 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 5040

NOV1 : 6228 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 6287
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 5041 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 5100

NOV1 : 6288 tccttat 6294 (SEQ ID NO: 58)
            |||||||
SECR : 5101 tccttat 5107 (SEQ ID NO: 24)



TABLE 4

Score = 2027 bits (5253), Expect = 0.0
Identities = 1023/1023 (100%), Positives = 1023/1023 (100%)

NOV1 :  259 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 318
            AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV
SECR :    1 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 60

NOV1 :  319 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 378
            YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ
SECR :   61 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 120

NOV1 :  379 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 438
            LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR
SECR :  121 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 180

NOV1 :  439 LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 498
            LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT
SECR :  181 LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 240

NOV1 :  499 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 558
            REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK
SECR :  241 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 300

NOV1 :  559 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 618
            IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV
SECR :  301 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 360






NOV1 :  619 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 678
            TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA
SECR :  361 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 420

NOV1 :  679 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 738
            YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA
SECR :  421 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 480

NOV1 :  739 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 798
            AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT
SECR :  481 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 540

NOV1 :  799 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 858
            EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH
SECR :  541 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 600

NOV1 :  859 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 918
            LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ
SECR :  601 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 660

NOV1 :  919 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 978
            PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR
SECR :  661 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 720

NOV1 :  979 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 1038
            WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ
SECR :  721 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 780

NOV1 : 1039 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 1098
            RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV
SECR :  781 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 840

NOV1 : 1099 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 1158
            TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH
SECR :  841 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 900

NOV1 : 1159 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 1218
            TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR
SECR :  901 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 960

NOV1 : 1219 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1278
            SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC
SECR :  961 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1020

NOV1 : 1279 QEG 1281 (SEQ ID NO: 59)
            QEG
SECR : 1021 QEG 1023 (SEQ ID NO: 25)

Based on the relatedness of NOV-1 to KIAA1233 sequences, which are related to lacunin,
thrombospondins, proteinases, semaphorins, ADAM-TS and properdin family members,
nucleic acids and proteins according to the invention likely have functions similar to those of
proteins belonging to these families. Thus, the NOV-1 of the invention is implicated in the
following diseases and processes and has therapeutic uses in them: (i) inflammation,
(ii) cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis and
vasculogenesis, in cancer as well as in ischemia, (v) tissue regeneration in vivo and
in vitro, and (vi) other diseases and disorders.
Functional roles attributed to this family of proteins include cell attachment, spreading,
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis.
Moreover, these proteins are expressed in the nervous system during development and are
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin,
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of
organ shape during development. In addition, the thrombospondins have been implicated in
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a
variety of disease states. Furthermore, semaphorin proteins have shown expression in
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance.


NOV 2: A Novel KIAA1233-like Protein 

The NOV-2 sequences according to the invention include nucleotide sequences
encoding a polypeptide related to KIAA1233 proteins, which bear sequence similarity to
lacunin, thrombospondins, proteinases, semaphorins, ADAM-TS, and properdin family
members.

NOV2a and NOV2b are splice variants. Splice variants are sequences that occur
naturally within the cells and tissues of individuals. The physiological activity of splice variant
products and of the original protein from which they are varied may be the same (although
perhaps at a different level), opposite, or completely different and unrelated. In addition,
variants may have no activity at all. When a variant and the original sequence have the same
or opposite activity, they may differ in various properties not directly connected to biological
activity, such as stability, clearance rate, tissue and cellular localization, temporal pattern of
expression, up- or down-regulation mechanisms, and responses to agonists or antagonists. The
presence or level of specific splice variants may be the cause of, and/or indicative of, a disease,
disorder, or pathological or normal condition.

Because a drug may be effective against one variant but not another, or may cause side
effects because it targets all splice variants, an effective drug needs to target the particular
splice variant. Because soluble variants with therapeutic or disease-related functions may be
naturally occurring in specific tissues, they may be optimal candidates for drug targets or
protein therapeutics. Variants may have no activity at all and may thus serve as dominant
negative natural inhibitors. Thus, splice variants are useful in generating new drug targets,
protein therapeutics and markers for diagnostics.

NOV-2 maps to Unigene cluster Hs.18705. This cluster has been mapped to
Chromosome 15 Marker stSG35204, Interval D15S115-D15S152. By integrating information
from the Online Mendelian Inheritance in Man (OMIM), this region is identified as
15q22-qter. Therefore, the chromosomal location of the invention is Chromosome 15 Marker
stSG35204, Interval D15S115-D15S152, which corresponds to 15q22-qter.

NOV-2a

A NOV-2a nucleic acid of the invention, encoding a KIAA1233-like protein
originating from chromosome 15, is shown in TABLE 5. The disclosed nucleic acid (SEQ ID
NO: 3) is 7260 nucleotides long and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 136 and ends with a TAA stop codon at nucleotide 5209.
The representative ORF encodes a 1691 amino acid polypeptide (SEQ ID NO: 4). The
initiation and stop codons of SEQ ID NO: 3 are shown in bold font. The protein has a
predicted molecular weight of 188743.8 daltons. Putative untranslated regions are upstream of
the initiation codon and downstream of the stop codon in SEQ ID NO: 3.
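
The ORF coordinates stated above can be cross-checked with a short calculation. The following is a minimal sketch, not part of the disclosure: the function name `orf_summary` is hypothetical, and it assumes that nucleotide 5209 is the first base of the TAA stop codon and that positions are 1-based.

```python
def orf_summary(start: int, stop_codon_start: int, seq_len: int) -> dict:
    """Summarise an ORF from 1-based coordinates.

    start            -- first base of the ATG initiation codon
    stop_codon_start -- first base of the stop codon (assumption: 5209 is this base)
    seq_len          -- total length of the nucleic acid
    """
    coding_nt = stop_codon_start - start  # coding bases, stop codon excluded
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coordinates do not describe an in-frame ORF")
    return {
        "codons": coding_nt // 3,                       # encoded amino acids
        "five_prime_utr_nt": start - 1,                 # bases upstream of ATG
        "three_prime_utr_nt": seq_len - (stop_codon_start + 2),  # bases after stop
    }


summary = orf_summary(start=136, stop_codon_start=5209, seq_len=7260)
print(summary)
```

Under these assumptions the coding region spans 5073 bases, i.e. 1691 codons, which agrees with the stated length of the polypeptide of SEQ ID NO: 4.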

TABLE 5

CGCACGAGGTGTTGACGGGCGGCTTCTGCCAACTTCTCCCCAGCGCGCGCCGAGCCCGCGCGGCCCCGGGGCTGCACGTC 
CCAGATACTTCTGCGGCGCAAGGCTACAACTGAGACCCGGAGGAGACTAGACCCCATGGCTTCCTGGACGAGCCCCTGGT 
GGGTGCTGATAGGGATGGTCTTCATGCACTCTCCCCTCCCGCAGACCACAGCTGAGAAATCTCCTGGAGCCTATTTCCTT 
CCCGAGTTTGCACTTTCTCCTCAGGGAAGTTTTCTGGAAGACACAACAGGGGAGCAGTTCCTCACTTATCGCTATGATGA 
CCAGACCTCAAGAAACACTCGTTCAGATGAAGACAAAGATGGCAACTGGGATGCTTGGGGCGACTGGAGTGACTGCTCCC
GGACCTGTGGGGGAGGAGCATCATATTCTCTGCGGAGATGTTTGACTGGAAGGAATTGTGAAGGGCAGAACATTCGGTAC 
AAGACATGCAGCAATCT^TGACTGCCCTCCAGATGCAGAAGATTTCAGAGCCCAGCAGTGCTC^^ 
GTATCAGGGGCATTACTATGAATGGCTTCCACGATATAATGATCCTGCTGCCCCGTGTGCACTC^ 
GACAAAACTTGGTGGTGGAGCTGGCACCTAAGGTACTGGATGGAACTCGTTGCAACACGGACTCCTTGGACA^ 

30 AGTGGCATCTGTCAGGCa^GTGGGCTGCGATCGGCAACTGGGAAGCAATGCCAAGGAGGACAAC 

CGATGGCTCCACCTGCAGGCTTGTACGGGGACAATCAAAGTCACACGTTTCTCCTGAAAAAAGAGAAGAAAATGTAATTG 
CTGTTCCTTTGGGAAGTCGAAGTGTGAGAATTACAGTGAAAGGACCTGCCCACCTCTTTATTGAATCAAATACACTTCAA 
GGAAGCAAAGGAGAACACAGCTTTAACAGCCCCGGCGTCTTTGTCGTAGAAAACACAACAGTGGAATO^ 
CGAGAGGCAAACTTTTAAGATTCCAGGACCTCTGATGGCTGATTTCATCTTCyVAGAC^^ 

35 GCGTGGTTCAGTTCTTCTTTTACCAGCCCATCAGTOVTaVGTGGAGACAAACTGACTTCT^^ 

GGAGGAGGTTATCAGCTCAATTCTGCTGAATGTGTGGATATCCGCTTGAAGAGGGTAGTTCCTGACCATTATTGTCAC 
CTACCCTGAAAATGTAAAACCAAAACCAAAACTGAAGGAATGCAGCATGGATCCCTGCC^^ 

AGATAATGCCCTATGACCACTTCCAACCTCTTCCTCGCTGGGAACATAATCCTTGGACTGCATGTTCCGTGTCCTGTGG^ 

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GGAGGGATTOIGAGACGGAGCTTTGTGTGTGTAGAGGAATCCATGCATGGAGAGATATTGCAGGTC^ 
CATGTACGCACCCAAACCCAAGGTTATGC^AACTTGTAATCTGTTTGATTGCCCCAAGTGGATTGCCAT^ 
AGTGC:ACAGTGACTTGTGGCCGAGGGTTAa3GTACCGGGTTGTTCTGTGTATTAACCACCGCGGAGAGCATGT^^ 

tgct^tccacaactgaagttacacatcaaagaagaatgtgtcattcccatccostgttataaaccaaaag;^^ 
5 agtggaagcaaaattgccttggctgaaacaagcacaagaactagaagagaccagaatagcaacagaagaacc^ 

ttccagaaccctggtcagcctgcagtaccacgtgtgggccgggtgtgcaggtccgtgaggtgaagtgccgtgtgctcctc 
acattcacgcagactgagactgagctgcccgaggaagagtgtgaaggccccaagctgcccaccgaacggccctgcctcct 
ggaagcatgtgatgagagcccggcctcccqagagctagactvtccctctccctgaggacagtgagaasactt^ 
agtacgctgggttca.ccccttgc7^cagcaacatgcgtgggaggccata^gaagc(^tagc^ 


ACCCAGCAGACAGTCAATGACAGCTTGTGTGATATGGTCCACCGTCCTCCAGCCATGAGCCAGGCCTGTAACACAGAGCC 
CTGTCCCCCCAGGTGGCATGTGGGCTCTTGGGGGCCCTGCTCAGCTACCTGTGGAGTTGGAATTCAGACCCGAGATGTGT 
ACTGCCTGCACCCAGGGGAGACCCCTGCCCCTCCTGA6GAGTGCCGAGATGA7VAAGCCCCATGCTTTACAAGCATGCAAT 
CAGTTTGACTGCCCTCCTGGCTGGCACATTGAAGAATGGCAGCAGTGTTCCAGGACTTGTGGCGGGGGAACTCAGAACAG 
AAGAGTCACCTGTCGGCAGCTGCTAACGGATGGCAGCTTTTTGAATCTCTCAGATG7UVTTGTGCCAAGGACCCAAGGCAT 
CGTCTCACAAGTCCTGTGCCAGGACAGACTGTCCTCCACATTTAGCTGTGGGAGACTGGTCGAAGTGTTCTGTCAGTTGT
GGTGTTGGAATCCAGAGAAGAAAGCAGGTGTGTCAAAGGCTGGCAGCCAT^GGTCGGCGCATCCCCCTC^ 
GTGCAGGGATCTACCAGGGTTCCCTCTTGTAAGATCTTGCCAGATGCCTGAGTGCAGTAAAATCMU\TCAGAGATGAAGA 
CAAAACTTGGTGAGCAGGGTCCGCAGATCCTCAGTGTCCAGAGAGTCTACATTCAGACAAGGGAAGAGAAGCGTATTAAC 
CTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATCCGTGATTATTAAGTGCCCCGTGCGACGATTCCAGAAATC 

20 tctgatccagtgggagaaggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaaa;^ 

TCCACGGTCTTGCTGCCCCCGACATCGGCGTGTACCGGTGCATTGCAGGCTCTGCACAGGAAACAGTTGTGCTCAAGCTC 
ATTGGTACTGAC?^CCGGCTCATCGCACGCCC7^GCCCTCAGGGAGCCTATGAGGGAATATCCTGGGATGGACCACAGC^ 
AGCCAATAGTTTGGGAGTCACATGGCACAAAATGAGGCAAATGTGGAATAACAAAAATGACCTTTATCTGGATGATGACC 

acattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcaggaagcacc^ 
25 aagaataagcagtttgaagcagcagttaaacaaggagcyvtatagcatggatacagcccagtttgatgagctgataagt^aa 
catgagtcagctcatggaaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaattag 

CCAAGGCACAGCCAACACACATGCAGTGGCGGGGCATCCAGGAAGAGACACCTCCTGCTGCTCAGCTCAGAGGGGAAACA 
GGGAGTGTGTCCCAAAGCTCGCATGCAAAAAACTCAGGCAAGCTGACATTCAAGCCGT^AAGGACCTGTTCTCATGAGGCA 
AAGCCy^CCTCCCTCAATTTCATTTAATAAAAC?^TAAATTCCAGGATTGGAAATA^ 
30 AGGTCATCAATATACTGTGTGACCTTATTACCCCCAGTGAGGCCACATATACATGGACCAAGGATGGAACCTTGTTAC^ 
CCCTCAGTAAAAATAATTTTGGATGGAACTGGGAAGATACAGATAO^GAATCCTACAAGGAAAGAACAAGGC^ 
ATGTTCTGTAGCTAATCATCTTGGTTCAGATGTGGAAAGTTCTTCTGTGCTGTATGCAGAGGCACCTGTCATCTTGTCTG 
TTGAAAGAAATATCACCAAACCAGAGCACAACCATCTGTCTGTTGTGGTTGGAGGCATCGTGGAGGCAGCCCTTGGAGCA 
AACGTGACAATCCGATGTCCTGTAAAAGGTGTCCCTCAGCCTAATATAACTTGGTTGAAGAGAGGAGGATCTCTGAGTGG 


CAATGTTTCCTTGCTTTTCAATGGATCCCTGTTGTTGCAGAATGTTTCCCTTGAAAATGAAGGAACCTACGTCTGCATAG 
CCACCAATGCTCTTGGAAAGGCAGTGGCAACATCTGTATTCCACTTGCTGGAACGAAGATGGCCAGAGAG^^ 
TTTCTGCAAGGACTITAAAAAGTACATTCTCCAGGCAACCAACACTAGAACCAACAGC^ 
GCCTCAAGAGCCTTTTTGGGAGCCTGGTAAOTGGTCACATTGTTCTGCCACCTGTGGTCATTTG^ 

GACCCCAGTGTGTGATGGCCAATGGGCAGGAAGTGAGTGAGGCCCTGTGTGATCACCTCCAGAAGCCACTGGCTGGGTTT 
GAGCCCTGTAACATCCGGGACTGCCCAGCGAGGTGGTTCACAAGTGTGTGGTCACAGTGCTCTGTGTCTTGCGGTGAAGG
ATACCACAGTCGGCAGGTGACGTGCAAGCGGACy^AAAGCCAATGGAACTGTGCAGGTGGTGTCTCCAAGAGCATGTGCCC 
CTAAAGACCGGCCTCTGGGAAGAAAACCATGTTTTGGTCyVTCCATGTGTTCAGTGGGAACCAGGGAACCGGTGTCCTGGA 
CGTTGCATGGGCCGTGCTGTGAGGATGCAGCAGCGTCAGACAGCTTGTCAACAC^^CAGCTCTGAC 
CAGAAAGAGACCCACCTTAAGAAGGAACTGCACATCAGGGGCCTGTGATGTGTGTTGGCACACAGGCCC^ 
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GTACAGCAGCCTGTGGCAGGGGTTTCCAGTCTCGGAAAGTCGACTGTATCCACACAA 
AGACACTGTGTACAGAAAAAGAAACCAATTTCCl^CGGCACrcTCTTG^ 

AACTCACTACTGTATGTTTGTAAAACATCTTAATTTGTGTTCTCTAGACCGCTACT^AACAAAGGTGCTGCCAGTC^ 
AAGAGGGATAAACCTTTGGAGGGGTCATGATGCTGCTGTGAAGATAAAAGTAGAATATAAAAGCTCTTTTCCCCATGTCG 
5 CTGATTCAAAAACATGTATTTCTTAAAAGACTAGATTCTATGGATCT^CAGAGGTTGATGCAAAAACACCACTGTTAAG 
GTGTAAAGTGAAATTTTCCAATGGTAGTTTTATATTCCAATTTTTTAAAATGATGTATTCAAGGATGAACAAJ^ 
AGCy^TGCATGCCy^CTGCACTTGGGACCTCATCy^TGTCAGTTGAATCGAGAAATCACC;^ 
TGCTGCCTCTTTCCTGTGATATGTAGACTAGCACa^GAGTGGTACATCCTAAAAACTTGGGAAACACAGCyV^ 
TCCTCTTCTCTCAAGTTGCAGGTTTTOUVCAGTTTTATAAGGTATTTGCATTTTAGAAGCTCTGGCCAGTAGTTGTTA^ 

10 ATGTTGGCATTAATGGCATTTTCATAGATCCTTGGTTTAGTCTGTGAAAAAGAAACCATCTCTCTGGATAGGCTGTCAC^ 
CTGACTGACCTAAGGGTTCATGGAAGCATGGCATCTTGTCCTTGCTTTTAGAACACCCATGGAAGAAAACACAGAGTAGA 
TATTGCTGTCATTTATACAACTACAGAAATTTATCTATGACCTAATGAGGCATCTCGGAAGTCAAAGAAGAGGGAAAGTT 
AACCTTTTCTACTGATTTCGTAGTATATTCAGAGCTTTCTTTTAAGAGCTGTGAATGAAACTTTTTCTAAGCACTATTCT 
ATTGCACACAAACAGAAAACCAAAGCCTTATTAGACCTAATTTATGCATAAAGTAGTATTCCTGAGAACTTTATTTTGGA 

AAATTTATAAGAAAGTAATCCAAATAAGAAACACGATAGTTGAAAATAATTTTTATAGTAAATAATTGTTTTGGGCTGAT
TTTTCAGTAAATCCAAAGTGACTTAGGTTAGAAGTTACACTAAGGACCAGGGGTTGGAATCAGAATTTAGTTTAAGAT^ 
GAGGAAAAGGGTAAGGGTTAGTTTCAGTTTTAGGATTAGAGCTAGAATTGGGTTAGGTGAGAAAGAAAGTTAAGGTTAAG 
GCTAGAGTTGTCTTTAAGGGTTAGGGTTAGGACCAGGTTAGGTCAGGGTTGGATTGGGTTTAGATTGGGGCCAGTGCTGG 
TGTTAGTGATAGTGTCAGGATGGAGGTTAGGTTTGGAGTAAGCGTTGTTGCTGAAGTGAGTTCAGGCTAGCATTAAATTG 

TAAGTTCTGAAGCTGATTTGGTTATGGGGTCTTTCCCCTGTATACTACCAGTTGTGTCTTTAGATGGCACACAAGTCCAA
ATAAGTGGTCATACTTCTTTATTCAGGGTCTCAGCTGCCTGTACACCTGCTGCCTACATCTTCTTGGCAACAAAGTTACC 
TGCCACAGGCTCTGCTGAGCCTAGTTCCTGGTCAGTAATAACTGAACAGTGCATTTTGGCTTTGGATGTGTCTGTGGACA 
AGCTTGCTGAGTTTCTCTACCATATTCTGAGCACACGGTCTCTTTTGTTCTAACTTCAGCTTCACTGACACTGGGTTGM 
CACTACTGTATGTGGAGGGTTTGGTGATTGGGAATGGATGGGGGACAGTGAGGAGGACACACCAGCCCATTAGTTGTTAA 

TCATCAATCACATCTGATTGTTGAAGGTTATTAAATTAAAAGAAAGATCATTTGTAACATACTCTTTGTATATATTTATT
ATATGAAAGGTGCAATATTTTATTTTGTACAGTATGTAATAAAGACATGGGACATATATTTTTCTTATTAACAAAATTTC 
ATATTAAATTGCTTCACTTTGTATTTAAAGTTAAAAGTTACTATTTTTCATTTGCTATTGTACTTTCATTGTTGTCAT^ 
AATTGACATTCCTGTGTACTGTATTTTACTACTGTTTTTATAACATGAGAGTTAATGTTTCTGTTTCATGATCCTTATO 

AATTCAGAAAT/^TTTACTTTGATTATTCAGTGGCATCCTTATAAAAAAAAAAAAAAAA (SEQ ID NO: 3) 




MASWTSPWWVLIGMVFMHSPLPQTTAEKSPGAYFLPEFALSPQGSFLEDTTGEQFLTYRYDDQTSRNTRSDEDKDG 
NWDAWGDWSDCSRTCGGGASYSLRRCLTGRNCEGQNIRYKTCSNHDCPPDAEDFRAQQCSAYNDVQYQGHYYEWLP 
RYNDPAAPCALKCHAQGQNLVVELAPKVLDGTRCNTDSLDMCISGICQAVGCDRQLGSNAKEDNCGVCAGDGSTCR 

LVRGQSKSHVSPEKREENVIAVPLGSRSVRITVKGPAHLFIESKTLQGSKGEHSFNSPGVFVVENTTVEFQRGSER
QTFKIPGPLMADFIFKTRYTAAKDSVVQFFFYQPISHQWRQTDFFPCTVTCGGGYQLNSAECVDIRLKRVVPDHYC
HYYPENVKPKPKLKECSMDPCPSSDGFKEIMPYDHFQPLPRWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQ
VEEWKCMYAPKPKVMQTCNLFDCPKWIAMEWSQCTVTCGRGLRYRVVLCINHRGEHVGGCNPQLKLHIKEECVIPI
PCYKPKEKSPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQTETELPEEE 

CEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATCVGGHQEAIAVCLHIQTQQTVNDSL
CDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCP 
PGWHIEEWQQCSRTCGGGTQNRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCG 
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VGIQRRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQTREEK 
RINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLKIHGLAAPDIGVYRCIAGSAQ
ETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCS
NSAGSTNSWELKNKQFEAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG 
IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRTEVINILCD 
LITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANHLGSDVESSSVLYAEAPVILSVERN
ITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCI
ATNALGKAVATSVFHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLG
ARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQVTCKRTKANGTVQ
VVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGA
CDVCWHTGPWKPCTAACGRGFQSRKVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHL 
NLCSLDRYKQRCCQSCQEG (SEQ ID NO: 4) 
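
The relationship between the ORF of SEQ ID NO: 3 and the polypeptide of SEQ ID NO: 4 can be illustrated by translating the first codons of the ORF. The sketch below is illustrative only and assumes the standard genetic code; the names `CODON_TABLE`, `translate`, and `orf_start` are hypothetical, and `orf_start` is the 54-base stretch beginning at the ATG at nucleotide 136 as read from TABLE 5.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame from its first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# First 18 codons of the NOV-2a ORF (nucleotides 136-189 of SEQ ID NO: 3).
orf_start = "ATGGCTTCCTGGACGAGCCCCTGGTGGGTGCTGATAGGGATGGTCTTCATGCAC"
print(translate(orf_start))
```

The translated residues correspond to the first 18 residues of SEQ ID NO: 4 as listed above (MASWTSPWWVLIGMVFMH).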

In a search of sequence databases, it was found, for example, that the disclosed
NOV-2a nucleotide sequence has 5104 of 5107 bases (99%) identical to a human mRNA for a
KIAA1233 protein (GenBank Accession No: AB033059), as shown in Table 6. In all
sequence alignments, identical residues are depicted as "|". As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0.0, the lowest
probability.
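
The "|" match line and the percent identity quoted for an alignment can be reproduced from a pair of aligned rows. The sketch below is an illustration only: `match_line` and `percent_identity` are hypothetical helper names, the two 20-base strings are the starts of the NOV-2a row at position 2738 and the subject row at position 601 of Table 6, and integer truncation of the percentage is an assumption that is consistent with 5104/5107 being reported as 99%.

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' where two equal-length aligned rows agree and ' ' elsewhere."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        "|" if q == s else " " for q, s in zip(query.lower(), subject.lower())
    )


def percent_identity(identities: int, aligned_length: int) -> int:
    """Percent identity truncated to an integer (assumed reporting convention)."""
    return identities * 100 // aligned_length


query = "ggttccctcttgtaagatct"    # NOV-2a, from position 2738
subject = "ggctccctcttgtaagatct"  # subject, from position 601
line = match_line(query, subject)
print(line)
print(percent_identity(5104, 5107))
```

The third position of the example differs between the two rows, so the match line carries a blank there; the same comparison applied over all 5107 aligned bases yields the three mismatches implied by 5104/5107.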

Furthermore, the encoded amino acid sequence has 1023 of 1023 amino acid residues
(100%) identical to, and 1021 of 1023 residues (100%) positive with, a 1023 amino acid
residue human KIAA1233 protein (GenBank Accession No: BAA86547), as shown in Table
7. As indicated by the "Expect" value, the probability of this alignment occurring by chance
alone is 0.0, the lowest probability.

TABLE 6

Score = 1.010e+04 bits (5095), Expect = 0.0
Identities = 5104/5107 (99%)
Strand = Plus / Plus

NOV2a : 2138 tagcagtgtgcttacatatccagacccagcagacagtcaatgacagcttgtgtgatatgg 2197
SECR  :    1                                                              60

NOV2a : 2198 tccaccgtcctccagccatgagccaggcctgtaacacagagccctgtccccccaggtggc 2257
SECR  :   61                                                              120

NOV2a : 2258 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 2317
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  121 atgtgggctcttgggggccctgctcagctacctgtggagttggaattcagacccgagatg 180

NOV2a : 2318 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 2377
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  181 tgtactgcctgcacccaggggagacccctgcccctcctgaggagtgccgagatgaaaagc 240

NOV2a : 2378 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 2437
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  241 cccatgctttacaagcatgcaatcagtttgactgccctcctggctggcacattgaagaat 300

NOV2a : 2438 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 2497
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  301 ggcagcagtgttccaggacttgtggcgggggaactcagaacagaagagtcacctgtcggc 360

NOV2a : 2498 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 2557
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  361 agctgctaacggatggcagctttttgaatctctcagatgaattgtgccaaggacccaagg 420

NOV2a : 2558 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 2617
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  421 catcgtctcacaagtcctgtgccaggacagactgtcctccacatttagctgtgggagact 480

NOV2a : 2618 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 2677
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  481 ggtcgaagtgttctgtcagttgtggtgttggaatccagagaagaaagcaggtgtgtcaaa 540

NOV2a : 2678 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 2737
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  541 ggctggcagccaaaggtcggcgcatccccctcagtgagatgatgtgcagggatctaccag 600

NOV2a : 2738 ggttccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 2797
             || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  601 ggctccctcttgtaagatcttgccagatgcctgagtgcagtaaaatcaaatcagagatga 660

NOV2a : 2798 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 2857
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  661 agacaaaacttggtgagcagggtccgcagatcctcagtgtccagagagtctacattcaga 720

NOV2a : 2858 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 2917
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  721 caagggaagagaagcgtattaacctgaccattggtagcagagcctatttgctgcccaaca 780

NOV2a : 2918 catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 2977
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR  :  781 catccgtgattattaagtgccccgtgcgacgattccagaaatctctgatccagtgggaga 840
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2978 aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 
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aggatggccgttgcctgcagaactccaaacggcttggcatcaccaagtcaggctcactaa 
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1020 



3038 aa 
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atccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 

I I I t I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I 
aaatccacggtcttgctgcccccgacatcggcgtgtaccggtgcattgcaggctctgcac 



3098 aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 
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I I I I I I I M I I M M I I I I I M I I I I I I I I I I I I M ( I I M I I I I I I I I I i I I I I N I I I 
aggaaacagttgtgctcaagctcattggtactgacaaccggctcatcgcacgcccagccc 
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NOV2a 
3217 

SECR : 
1080 



3158 tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 
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tcagggagcctatgagggaatatcctgggatggaccacagcgaagccaatagtttgggag 
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; 3218 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 
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1081 tcacatggcacaaaatgaggcaaatgtggaataacaaaaatgacctttatctggatgatg 
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N0V2a 
3337 

SECR : 
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3278 accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 
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accacattagtaaccagcctttcttgagagctctgttaggccactgcagcaattctgcag 
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; 3338 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 

I I I I I I I I I I I t I I I I I I I I I I I I I I I M I I I I 1 I I I I I I I I i I I I I I I I I i I I I I I I I I 
1201 gaagcaccaactcctgggagttgaagaataagcagtttgaagcagcagttaaacaaggag 
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N0V2a 
3457 

SECR : 
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3398 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 
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12 61 catatagcatggatacagcccagtttgatgagctgataagaaacatgagtcagctcatgg 
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3517 

SECR : 
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34 58 aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 
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aaaccggagaggtcagcgatgatcttgcgtcccagctgatatatcagctggtggccgaat 
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SECR : 
1440 



3518 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 

I I I I I I I I I I I I I I I I I I i I M 1 i I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I M I I I I I 
1381 tagccaaggcacagccaacacacatgcagtggcggggcatccaggaagagacacctcctg 
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N0V2a 
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3578 ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 



1441 



I I I I I M I I I I i I I I I I I I I I I I I I I i I I I I I i I I I I I I I I I I I I I I I M I I I I I I I I I I 
ctgctcagctcagaggggaaacagggagtgtgtcccaaagctcgcatgcaaaaaactcag 
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N0V2a 
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SECR : 
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3638 gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 
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M I I I 11 I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I M I I 
gcaagctgacattcaagccgaaaggacctgttctcatgaggcaaagccaacctccctcaa 
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N0V2a 
3757 

SECR : 
1620 



3698 tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 



1561 



I I I M I I I M I I I I I I I I I I I i i I I I I I I M t M I I I I I I I I I i I I I I I I I I I i i I I I I I 
tttcatttaataaaacaataaattccaggattggaaatacagtatacattacaaaaagga 
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NOV2a 
3817 

SECR : 
1680 



3758 cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 



1621 



I I I I i I I I t I I I I I I I I I I I I I I M I M I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M 
cagaggtcatcaatatactgtgtgaccttattacccccagtgaggccacatatacatgga 



60 



N0V2a 
3877 

SECR : 
1740 



3818 ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 



1681 



I I I I I I M I I I I 1 I i I I I I I I I I I I M I I I I M I I I I I I I I I I i I I I I I I M I I I I I I I I 
ccaaggatggaaccttgttacagccctcagtaaaaataattttggatggaactgggaaga 
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; 3878 tacagatacagaatcctacaaggaaagaacaaggcatatatgaatgttctgtagctaatc 

M M I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I 
1741 t acagatacagaatcct acaaggaaagaacaaggcatatat gaatgttctgtagctaatc 
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N0V2a 
3997 

SECR : 
1860 



; 3938 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 

M I N M M M M M I I I I M I M I I M I I I I I i I I I I M I I I i I I I I I I I I I I N M M 
1801 atcttggttcagatgtggaaagttcttctgtgctgtatgcagaggcacctgtcatcttgt 
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4057 

SECR : 
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3998 ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 
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I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ctgttgaaagaaatatcaccaaaccagagcacaaccatctgtctgttgtggttggaggca 
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; 4058 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 

I M M I I I I I I I I I I I I I I I I I I I I I I I I I I i I 1 I I I I I I I I I I i I I I I I I I I I I I I I I I 
1921 tcgtggaggcagcccttggagcaaacgtgacaatccgatgtcctgtaaaaggtgtccctc 
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NOV2a 
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SECR : 
2040 



: 4118 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 
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1981 agcctaatataacttggttgaagagaggaggatctctgagtggcaatgtttccttgcttt 
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: 4178 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 
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2041 tcaatggatccctgttgttgcagaatgtttcccttgaaaatgaaggaacctacgtctgca 



45 



N0V2a 
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SECR : 
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: 4238 tagccaccaatgctcttggaaaggcagtggcaacatctgtattccacttgctggaacgaa 
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2101 tagccaccaatgctcttggaaaggcagtggcaacatctgtactccacttgctggaacgaa 



50 



55 



NOV2a 
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: 4298 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 

I I I I I I I I I M I I I I i I I I I I I I I I I I I I 1 I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I 
2161 gatggccagagagtagaatcgtatttctgcaaggacataaaaagtacattctccaggcaa 
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N0V2a 
4417 

SECR : 
2280 



4358 ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 
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ccaacactagaaccaacagcaatgacccaacaggagaacccccgcctcaagagccttttt 
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N0V2a : 4 418 gggagcctggtaactggtcacattgttctgccacctgtggt catttgggagcccgcattc 
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I I I I I t I I I I I I I I i N I I I M M I I I I M I I i I M I I I M I I I I I I I 1.1 I I I I I I M I I 
SECR : 2281 gggagcctggtaactggtcacattgttctgccacctgtggtcatttgggagcccgcattc 
5 2340 



N0V2a : 4 4 78 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 
4537 

10 II I I ill Ml III MM INIIIMIill MM II MINI Mill II MUM MM M 

SECR : 2341 agagaccccagtgtgtgatggccaatgggcaggaagtgagtgaggccctgtgtgatcacc 
2400 
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N0V2a 
4657 

SECR : 
2520 



4 538 tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 



2401 



M I M II M M 1 M II II I M M I M M M I M M M M M M M II M I M M M II i I 
tccagaagccactggctgggtttgagccctgtaacatccgggactgcccagcgaggtggt 



4598 tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 
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M I II I I M M II I M I M M M M M M II M II M I M I I II M I I M M I M M M I 
tcacaagtgtgtggtcacagtgctctgtgtcttgcggtgaaggataccacagtcggcagg 
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; 4 658 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 
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2521 tgacgtgcaagcggacaaaagccaatggaactgtgcaggtggtgtctccaagagcatgtg 
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: 4718 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 
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2581 cccctaaagaccggcctctgggaagaaaaccatgttttggtcatccatgtgttcagtggg 
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SECR : 
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4-778 aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 
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M I M II M II M II II M I M M II M M II II M M II M M 11 II M M M M M M 
aaccagggaaccggtgtcctggacgttgcatgggccgtgctgtgaggatgcagcagcgtc 



50 N0V2a : 4838 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 
4897 

II 11 II I M II II I M M M M M I I M II II I II M I M II M II I I I I i M M I M II 
SECR : 2701 acacagcttgtcaacacaacagctctgactccaactgtgatgacagaaagagacccacct 

2760 
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N0V2a : 4898 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 
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M I I I I I I M I I I I II i I 11 I I M 11 I I I II II I I I i 1 II I I I M I I 1 1 I II I II I M II 
60 SECR : 27 61 taagaaggaactgcacatcaggggcctgtgatgtgtgttggcacacaggcccttggaagc 
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cctgtacagcagcctgtggcaggggtttccagtctcggaaagtcgactgtatccacacaa 
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5018 ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 
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ggagttgcaaacctgtggccaagagacactgtgtacagaaaaagaaaccaatttcctggc 
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5078 ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 
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ggcactgtcttgggccctcctgtgatagagactgcacagacacaactcactactgtatgt 
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3001 ttgtaaaacatcttaatttgtgttctctagaccgctacaaacaaaggtgctgccagtcat 
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5198 gtcaagagggataaacctttggaggggtcatgatgctgctgtgaagataaaagtagaata 
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: 5258 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 
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3121 taaaagctcttttccccatgtcgctgattcaaaaacatgtatttcttaaaagactagatt 
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5318 ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 
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ctatggatcaaacagaggttgatgcaaaaacaccactgttaaggtgtaaagtgaaatttt 
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5437^ • 5378 ccaatggtagttttatattccaattttttaaaatgatgtattcaaggatgaacaaaatac 
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atctctctggataggctgtcacactgactgacctaagggttcatggaagcatggcatctt 



35 



40 



N0V2a 
5857 

SECR : 
3720 
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linilllMllilMIIIIIIIIIIIIIIIMIIIIIIIiMMIIiiiililiitiii 
3901 taatttatgcataaagtagtattcctgagaactttattttggaaaatttataagaaagta 



10 



NOV2a: 6098 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 6157
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 3961 atccaaataagaaacacgatagttgaaaataatttttatagtaaataattgttttgggct 4020

NOV2a: 6158 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 6217
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4021 gatttttcagtaaatccaaagtgacttaggttagaagttacactaaggaccaggggttgg 4080

NOV2a: 6218 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 6277
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4081 aatcagaatttagtttaagatttgaggaaaagggtaagggttagtttcagttttaggatt 4140

NOV2a: 6278 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 6337
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4141 agagctagaattgggttaggtgagaaagaaagttaaggttaaggctagagttgtctttaa 4200

NOV2a: 6338 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 6397
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4201 gggttagggttaggaccaggttaggtcagggttggattgggtttagattggggccagtgc 4260

NOV2a: 6398 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 6457
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4261 tggtgttagtgatagtgtcaggatggaggttaggtttggagtaagcgttgttgctgaagt 4320

NOV2a: 6458 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 6517
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4321 gagttcaggctagcattaaattgtaagttctgaagctgatttggttatggggtctttccc 4380

NOV2a: 6518 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 6577
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4381 ctgtatactaccagttgtgtctttagatggcacacaagtccaaataagtggtcatacttc 4440

NOV2a: 6578 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 6637
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4441 tttattcagggtctcagctgcctgtacacctgctgcctacatcttcttggcaacaaagtt 4500

NOV2a: 6638 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 6697
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4501 acctgccacaggctctgctgagcctagttcctggtcagtaataactgaacagtgcatttt 4560

NOV2a: 6698 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 6757
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4561 ggctttggatgtgtctgtggacaagcttgctgagtttctctaccatattctgagcacacg 4620

NOV2a: 6758 gtctcttttgttctaacttcagcttcactgacactgggttgagcactactgtatgtggag 6817
            |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
SECR : 4621 gtctcttttgttctaatttcagcttcactgacactgggttgagcactactgtatgtggag 4680

NOV2a: 6818 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 6877
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4681 ggtttggtgattgggaatggatgggggacagtgaggaggacacaccagcccattagttgt 4740

NOV2a: 6878 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 6937
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4741 taatcatcaatcacatctgattgttgaaggttattaaattaaaagaaagatcatttgtaa 4800

NOV2a: 6938 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 6997
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4801 catactctttgtatatatttattatatgaaaggtgcaatattttattttgtacagtatgt 4860

NOV2a: 6998 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 7057
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4861 aataaagacatgggacatatatttttcttattaacaaaatttcatattaaattgcttcac 4920

NOV2a: 7058 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 7117
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4921 tttgtatttaaagttaaaagttactatttttcatttgctattgtactttcattgttgtca 4980

NOV2a: 7118 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 7177
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 4981 ttcaattgacattcctgtgtactgtattttactactgtttttataacatgagagttaatg 5040

NOV2a: 7178 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 7237
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SECR : 5041 tttctgtttcatgatccttatgtaattcagaaataaatttactttgattattcagtggca 5100

NOV2a: 7238 tccttat 7244 (SEQ ID NO: 60)
            |||||||
SECR : 5101 tccttat 5107 (SEQ ID NO: 26)
TABLE 7

Score = 2045 bits (5300), Expect = 0.0
Identities = 1021/1023 (99%), Positives = 1021/1023 (99%)

NOV2A:  669 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 728
            AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV
SECR :    1 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCGVGIQTRDV 60

NOV2A:  729 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 788
            YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ
SECR :   61 YCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQ 120

NOV2A:  789 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 848
            LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR
SECR :  121 LLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQR 180

NOV2A:  849 LAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 908
            LAAKGRRIPLSEMMCRDLPG PLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT
SECR :  181 LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT 240

NOV2A:  909 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 968
            REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK
SECR :  241 REEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLGITKSGSLK 300

NOV2A:  969 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 1028
            IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV
SECR :  301 IHGLAAPDIGVYRCIAGSAQETVVLKLIGTDNRLIARPALREPMREYPGMDHSEANSLGV 360

NOV2A: 1029 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 1088
            TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA
SECR :  361 TWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGA 420

NOV2A: 1089 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 1148
            YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA
SECR :  421 YSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPA 480

NOV2A: 1149 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 1208
            AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT
SECR :  481 AQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRT 540

NOV2A: 1209 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 1268
            EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH
SECR :  541 EVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGIYECSVANH 600

NOV2A: 1269 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 1328
            LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ
SECR :  601 LGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVVVGGIVEAALGANVTIRCPVKGVPQ 660

NOV2A: 1329 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVFHLLERR 1388
            PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSV HLLERR
SECR :  661 PNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATSVLHLLERR 720

NOV2A: 1389 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 1448
            WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ
SECR :  721 WPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQ 780

NOV2A: 1449 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 1508
            RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV
SECR :  781 RPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQV 840

NOV2A: 1509 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 1568
            TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH
SECR :  841 TCKRTKANGTVQVVSPRACAPKDRPLGRKPCFGHPCVQWEPGNRCPGRCMGRAVRMQQRH 900

NOV2A: 1569 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 1628
            TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR
SECR :  901 TACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIHTR 960

NOV2A: 1629 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1688
            SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC
SECR :  961 SCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSC 1020

NOV2A: 1689 QEG 1691 (SEQ ID NO: 61)
            QEG
SECR : 1021 QEG 1023 (SEQ ID NO: 27)
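
The middle line of each alignment row and the identity count can be derived from the two aligned rows. The following is a minimal sketch, assuming ungapped rows of equal length. The function names `match_line` and `count_identities` are hypothetical, and the sketch marks identities only: scoring "positives" would require a substitution matrix, which is not shown here. The example uses the Table 7 row beginning LAAKGRRIPL, in which the two sequences differ at one residue.

```python
def match_line(query: str, subject: str) -> str:
    """Show the residue where both rows agree, and a space where they differ."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return "".join(q if q == s else " " for q, s in zip(query, subject))


def count_identities(query: str, subject: str) -> int:
    """Number of aligned positions carrying the same residue."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return sum(q == s for q, s in zip(query, subject))


# One row of Table 7 (NOV2A above, SECR below).
nov2a = "LAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT"
secr = "LAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQT"

print(match_line(nov2a, secr))
print(count_identities(nov2a, secr), "of", len(nov2a))
```

Summing the per-row counts over all rows gives the numerator of the "Identities" figure in the table header.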

SignalP and PSORT analyses indicate that NOV-2 may be localized in the endoplasmic 
reticulum, with a likely cleavage site between positions 26 and 27. Thus, it is likely that the 
NOV-2a protein is available at the appropriate subcellular localization for the therapeutic uses 
described in this application. 

Based on the relatedness of the disclosed NOV-2a to KIAA1233 sequences, which are 
related to lacunin, thrombospondins, proteinases, semaphorins, ADAM-TS and properdin 
family members, the nucleic acids and proteins of the invention can have functions similar to 
those of proteins belonging to these families. 

Functional roles attributed to this family of proteins include cell attachment, spreading, 
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis. 
Moreover, these proteins are expressed in the nervous systems during development and are 
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin, 
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The 
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of 
organ shape during development. In addition, the thrombospondins have been implicated in 
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a 
variety of disease states. Furthermore, semaphorin proteins have shown expression in 
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance. 
Thus, the NOV-2a sequences of the invention are implicated in the following diseases and 
processes and have therapeutic uses in these diseases and processes: (i) inflammation, (ii) 
cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis and vasculogenesis 
in cancer as well as for ischemia, (v) tissue regeneration in vivo and in vitro, and (vi) 
other diseases and disorders. 



NOV 2b: 

A NOV-2b nucleic acid of the invention, encoding a KIAA1233-like protein, is found 
within the nucleotide sequence of NOV-2a (SEQ ID NO: 3) in Table 5. The disclosed nucleic 
acid is 6303 nucleotides in length and contains an open reading frame (ORF) that begins with 
an ATG initiation codon at nucleotide 425 and ends with a TAA stop codon at nucleotides 
4268 (SEQ ID NO: 57). The initiation and stop codons of NOV-2b are shown in bold font in 
SEQ ID NO: 4. The representative ORF encodes a 406 amino acid polypeptide (SEQ ID NO: 
5), which is shown below in Table 8. Putative untranslated regions are upstream of the 
initiation codon and downstream of the stop codon in SEQ ID NO: 57. 
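
The ORF boundaries described above can be checked mechanically by scanning codons in frame from the initiation codon. The following is a minimal sketch, assuming 1-based coordinates and the standard stop codons TAA, TAG and TGA. The function name `find_orf` and the toy sequence are hypothetical and are not part of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str, start_1based: int):
    """Return (start, end) as 1-based inclusive coordinates of the ORF that
    begins at the ATG at start_1based and ends with the last base of the
    first in-frame stop codon. Return None if there is no ATG at that
    position or no in-frame stop codon downstream."""
    seq = seq.upper()
    i = start_1based - 1
    if i < 0 or seq[i:i + 3] != "ATG":
        return None
    # Step through the sequence one codon at a time, staying in frame.
    for j in range(i, len(seq) - 2, 3):
        if seq[j:j + 3] in STOP_CODONS:
            return (start_1based, j + 3)
    return None


# Toy sequence: ATG at position 3, then GCT, TGG and the stop codon TAA.
toy = "CCATGGCTTGGTAAGG"
print(find_orf(toy, 3))
```

The same scan could be applied to SEQ ID NO: 57 with a start position of 425 to locate the first in-frame stop codon.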



TABLE 8 

TATAATTATTAATAGAGACCTTTCAAAGGACAAATTCTGTGAAATAAAGTGGTTTTCTGA 
AGAGCCTACTAATAGGACAGTGTGTTAATATCACTAATAAGAGAGTAATGATTATAAAAA 

GGAATAAATTTATTGAAATTGCAAGATACTTTTCTCCTTTGATTAATATACTGCTAGTTT 
AGTTTTCTACATTTTCAAATAGAACTGGGGAATTTGTGTCGTAGATATTCTTGACAACTA 
AAGAGATGGTGGCTGAATTTTTGGGAATGGTTGATAACACTTGATATTTTTAGTTTCCAA 
TTTGGAAGAGCTCTGTCTCTTGGGATGTCAAATATTATATTCGTCAATTAATGAATGTGT 
TAATTTATTATAGAAATGATATTCTCACAATGATTTCATTTGTAGTGATGGATTTAAAGA 

GATAATGCCCTATGACCACTTCCAACCTCTTCCTCGCTGGGAACATAATCCTTGGACTGC 
ATGTTCCGTGTCCTGTGGAGGAGGGATTCAGAGACGGAGCTTTGTGTGTGTAGAGGAATC 
CATGCATGGAGAGATATTGCAGGTGGAAGAATGGAAGTGCATGTACGCACCCAAACCCAA 
GGTTATGCAAACTTGTAATCTGTTTGATTGCCCCAAGTGGATTGCCATGGAGTGGTCTCA 
GTGCACAGTGACTTGTGGCCGAGGGTTACGGTACCGGGTTGTTCTGTGTATTAACCACCG 

CGGAGAGCATGTTGGGGGCTGCAATCCACAACTGAAGTTACACATCAAAGAAGAATGTGT 
CATTCCCATCCCGTGTTATAAACCAAAAGAAAAAAGTCCAGTGGAAGCAAAATTGCCTTG 
GCTGAAACAAGCACAAGAACTAGAAGAGACCAGAATAGCAACAGAAGAACCAACGTTCAT 
TCCAGAACCCTGGTCAGCCTGCAGTACCACGTGTGGGCCGGGTGTGCAGGTCCGTGAGGT 
GAAGTGCCGTGTGCTCCTCACATTCACGCAGACTGAGACTGAGCTGCCCGAGGAAGAGTG 
TGAAGGCCCCAAGCTGCCCACCGAACGGCCCTGCCTCCTGGAAGCATGTGATGAGAGCCC 
GGCCTCCCGAGAGCTAGACATCCCTCTCCCTGAGGACAGTGAGACGACTTACGACTGGGA 
GTACGCTGGGTTCACCCCTTGCACAGCAACATGCGTGGGAGGCCATCAAGAAGCCATAGC 
AGTGTGCTTACATATCCAGACCCAGCAGACAGTCAATGACAGCTTGTGTGATATGGTCCA 
CCGTCCTCCAGCCATGAGCCAGGCCTGTAACACAGAGCCCTGTCCCCCCAGGTGGCATGT 
GGGCTCTTGGGGGCCCTGCTCAGCTACCTGTGGAGTTGGAATTCAGACCCGAGATGTGTA 
CTGCCTGCACCCAGGGGAGACCCCTGCCCCTCCTGAGGAGTGCCGAGATGAAAAGCCCCA 

TGCTTTACAAGCATGCAATCAGTTTGACTGCCCTCCTGGCTGGCACATTGAAGAATGGCA 
GCAGTGTTCCAGGACTTGTGGCGGGGGAACTCAGAACAGAAGAGTCACCTGTCGGCAGCT 
GCTAACGGATGGCAGCTTTTTGAATCTCTCAGATGAATTGTGCCAAGGACCCAAGGCATC 
GTCTCACAAGTCCTGTGCCAGGACAGACTGTCCTCCACATTTAGCTGTGGGAGACTGGTC 
GAAGTGTTCTGTCAGTTGTGGTGTTGGAATCCAGAGAAGAAAGCAGGTGTGTCAAAGGCT 

GGCAGCCAAAGGTCGGCGCATCCCCCTCAGTGAGATGATGTGCAGGGATCTACCAGGGTT 
CCCTCTTGTAAGATCTTGCCAGATGCCTGAGTGCAGTAAAATCAAATCAGAGATGAAGAC 
AAAACTTGGTGAGCAGGGTCCGCAGATCCTCAGTGTCCAGAGAGTCTACATTCAGACAAG 
GGAAGAGAAGCGTATTAACCTGACCATTGGTAGCAGAGCCTATTTGCTGCCCAACACATC 
CGTGATTATTAAGTGCCCCGTGCGACGATTCCAGAAATCTCTGATCCAGTGGGAGAAGGA 

TGGCCGTTGCCTGCAGAACTCCAAACGGCTTGGCATCACCAAGTCAGGCTCACTAAAAAT 
CCACGGTCTTGCTGCCCCCGACATCGGCGTGTACCGGTGCATTGCAGGCTCTGCACAGGA 
AACAGTTGTGCTCAAGCTCATTGGTACTGACAACCGGCTCATCGCACGCCCAGCCCTCAG 
GGAGCCTATGAGGGAATATCCTGGGATGGACCACAGCGAAGCCAATAGTTTGGGAGTCAC 
ATGGCACAAAATGAGGCAAATGTGGAATAACAAAAATGACCTTTATCTGGATGATGACCA 

CATTAGTAACCAGCCTTTCTTGAGAGCTCTGTTAGGCCACTGCAGCAATTCTGCAGGAAG 
CACCAACTCCTGGGAGTTGAAGAATAAGCAGTTTGAAGCAGCAGTTAAACAAGGAGCATA 
TAGCATGGATACAGCCCAGTTTGATGAGCTGATAAGAAACATGAGTCAGCTCATGGAAAC 
CGGAGAGGTCAGCGATGATCTTGCGTCCCAGCTGATATATCAGCTGGTGGCCGAATTAGC 
CAAGGCACAGCCAACACACATGCAGTGGCGGGGCATCCAGGAAGAGACACCTCCTGCTGC 

TCAGCTCAGAGGGGAAACAGGGAGTGTGTCCCAAAGCTCGCATGCAAAAAACTCAGGCAA 
GCTGACATTCAAGCCGAAAGGACCTGTTCTCATGAGGCAAAGCCAACCTCCCTCAATTTC 
ATTTAATAAAACAATAAATTCCAGGATTGGAAATACAGTATACATTACAAAAAGGACAGA 
GGTCATCAATATACTGTGTGACCTTATTACCCCCAGTGAGGCCACATATACATGGACCAA 
GGATGGAACCTTGTTACAGCCCTCAGTAAAAATAATTTTGGATGGAACTGGGAAGATACA 

GATACAGAATCCTACAAGGAAAGAACAAGGCATATATGAATGTTCTGTAGCTAATCATCT 
TGGTTCAGATGTGGAAAGTTCTTCTGTGCTGTATGCAGAGGCACCTGTCATCTTGTCTGT 
TGAAAGAAATATCACCAAACCAGAGCACAACCATCTGTCTGTTGTGGTTGGAGGCATCGT 
GGAGGCAGCCCTTGGAGCAAACGTGACAATCCGATGTCCTGTAAAAGGTGTCCCTCAGCC 
TAATATAACTTGGTTGAAGAGAGGAGGATCTCTGAGTGGCAATGTTTCCTTGCTTTTCAA 

TGGATCCCTGTTGTTGCAGAATGTTTCCCTTGAAAATGAAGGAACCTACGTCTGCATAGC 
CACCAATGCTCTTGGAAAGGCAGTGGCAACATCTGTACTCCACTTGCTGGAACGAAGATG 
GCCAGAGAGTAGAATCGTATTTCTGCAAGGACATAAAAAGTACATTCTCCAGGCAACCAA 
CACTAGAACCAACAGCAATGACCCAACAGGAGAACCCCCGCCTCAAGAGCCTTTTTGGGA 
GCCTGGTAACTGGTCACATTGTTCTGCCACCTGTGGTCATTTGGGAGCCCGCATTCAGAG 

ACCCCAGTGTGTGATGGCCAATGGGCAGGAAGTGAGTGAGGCCCTGTGTGATCACCTCCA 
GAAGCCACTGGCTGGGTTTGAGCCCTGTAACATCCGGGACTGCCCAGCGAGGTGGTTCAC 
AAGTGTGTGGTCACAGTGCTCTGTGTCTTGCGGTGAAGGATACCACAGTCGGCAGGTGAC 
GTGCAAGCGGACAAAAGCCAATGGAACTGTGCAGGTGGTGTCTCCAAGAGCATGTGCCCC 
TAAAGACCGGCCTCTGGGAAGAAAACCATGTTTTGGTCATCCATGTGTTCAGTGGGAACC 

AGGGAACCGGTGTCCTGGACGTTGCATGGGCCGTGCTGTGAGGATGCAGCAGCGTCACAC 
AGCTTGTCAACACAACAGCTCTGACTCCAACTGTGATGACAGAAAGAGACCCACCTTAAG 
AAGGAACTGCACATCAGGGGCCTGTGATGTGTGTTGGCACACAGGCCCTTGGAAGCCCTG 
TACAGCAGCCTGTGGCAGGGGTTTCCAGTCTCGGAAAGTCGACTGTATCCACACAAGGAG 
TTGCAAACCTGTGGCCAAGAGACACTGTGTACAGAAAAAGAAACCAATTTCCTGGCGGCA 

CTGTCTTGGGCCCTCCTGTGATAGAGACTGCACAGACACAACTCACTACTGTATGTTTGT 
AAAACATCTTAATTTGTGTTCTCTAGACCGCTACAAACAAAGGTGCTGCCAGTCATGTCA 
AGAGGGATAAACCTTTGGAGGGGTCATGATGCTGCTGTGAAGATAAAAGTAGAATATAAA 
AGCTCTTTTCCCCATGTCGCTGATTCAAAAACATGTATTTCTTAAAAGACTAGATTCTAT 
GGATCAAACAGAGGTTGATGCAAAAACACCACTGTTAAGGTGTAAAGTGAAATTTTCCAA 

TGGTAGTTTTATATTCCAATTTTTTAAAATGATGTATTCAAGGATGAACAAAATACTATA 
GCATGCATGCCACTGCACTTGGGACCTCATCATGTCAGTTGAATCGAGAAATCACCAAGA 
TTATGAGTGCATCCTCACGTGCTGCCTCTTTCCTGTGATATGTAGACTAGCACAGAGTGG 
TACATCCTAAAAACTTGGGAAACACAGCAACCCATGACTTCCTCTTCTCTCAAGTTGCAG 
GTTTTCAACAGTTTTATAAGGTATTTGCATTTTAGAAGCTCTGGCCAGTAGTTGTTAAGA 
TGTTGGCATTAATGGCATTTTCATAGATCCTTGGTTTAGTCTGTGAAAAAGAAACCATCT 
CTCTGGATAGGCTGTCACACTGACTGACCTAAGGGTTCATGGAAGCATGGCATCTTGTCC 
TTGCTTTTAGAACACCCATGGAAGAAAACACAGAGTAGATATTGCTGTCATTTATACAAC 
TACAGAAATTTATCTATGACCTAATGAGGCATCTCGGAAGTCAAAGAAGAGGGAAAGTTA 
ACCTTTTCTACTGATTTCGTAGTATATTCAGAGCTTTCTTTTAAGAGCTGTGAATGAAAC 
TTTTTCTAAGCACTATTCTATTGCACACAAACAGAAAACCAAAGCCTTATTAGACCTAAT 
TTATGCATAAAGTAGTATTCCTGAGAACTTTATTTTGGAAAATTTATAAGAAAGTAATCC 
AAATAAGAAACACGATAGTTGAAAATAATTTTTATAGTAAATAATTGTTTTGGGCTGATT 
TTTCAGTAAATCCAAAGTGACTTAGGTTAGAAGTTACACTAAGGACCAGGGGTTGGAATC 
AGAATTTAGTTTAAGATTTGAGGAAAAGGGTAAGGGTTAGTTTCAGTTTTAGGATTAGAG 
CTAGAATTGGGTTAGGTGAGAAAGAAAGTTAAGGTTAAGGCTAGAGTTGTCTTTAAGGGT 
TAGGGTTAGGACCAGGTTAGGTCAGGGTTGGATTGGGTTTAGATTGGGGCCAGTGCTGGT 
GTTAGTGATAGTGTCAGGATGGAGGTTAGGTTTGGAGTAAGCGTTGTTGCTGAAGTGAGT 
TCAGGCTAGCATTAAATTGTAAGTTCTGAAGCTGATTTGGTTATGGGGTCTTTCCCCTGT 
ATACTACCAGTTGTGTCTTTAGATGGCACACAAGTCCAAATAAGTGGTCATACTTCTTTA 

TTCAGGGTCTCAGCTGCCTGTACACCTGCTGCCTACATCTTCTTGGCAACAAAGTTACCT 
GCCACAGGCTCTGCTGAGCCTAGTTCCTGGTCAGTAATAACTGAACAGTGCATTTTGGCT 
TTGGATGTGTCTGTGGACAAGCTTGCTGAGTTTCTCTACCATATTCTGAGCACACGGTCT 
CTTTTGTTCTAATTTCAGCTTCACTGACACTGGGTTGAGCACTACTGTATGTGGAGGGTT 
TGGTGATTGGGAATGGATGGGGGACAGTGAGGAGGACACACCAGCCCATTAGTTGTTAAT 

1 5 CATCAATCACATCTGATTGTTGAAGGTTATTAAATTAAAAGAAAGATCATTTGTAACATA 
CTCTTTGTATATATTTATTATATGAAAGGTGCAATATTTTATTTTGTACAGTATGTAATA 
AAGACATGGGACATATATTTTTCTTATTAACAAAATTTCATATTAAATTGCTTCACTTTG 
TATTTAAAGTTA/^AAGTTACTATTTTTCATTTGCTATTGTACTTTCATTGTTGTCATTCA 
ATTGACATTCCTGTGTACTGTATTTTACTACTGTTTTTATAACATGAGAGTTAATGTTTC 

20 TGTTTCATGATCCTTATGTAATTCAGAAATAAATTTACTTTGATTATTCAGTGGCATCCT 

TAT (SEQ ID NO: 57) 



MPYDHFQPLPRWEHNPWTACSVSCGGGIQRRSE^CVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCPKWIAME 
WSQCTVTCGRGLRYRWLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEKSPVEAKLPWLKQAQELEETRIA 

25 TEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQTETELPEEECEGPKLPTERPCLLEACDESPASRELDIPIi 
PEDSETTYDWEYAGFTPCTATCVGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGP 
CSATCGVGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQNRRVTCRQLL 
TDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQRRKQVCQRLAAKGRRIPLSEMMCRDL 
PGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILSVQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKS 

30 LIQWEKDGRCLQNSKRLGITKSGSLKIHGLAAPDIGVYRCIAGSAQETWLKLIGTDNRLIARPALREPMREYPGM 
DHSEANSLGVTWHKMRQMWNNBCNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQFEAAVKQGAYSMDTA 
QFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRGIQEETPPAAQLRGETGSVSQSSHAKNSGKL 
TFKPKGPVLMRQSQPPSISFNKTINSRIGNTVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTG 
KIQIQNPTRKEQGIYECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSWVGGIVEAALGANVTIRCP 

35 VKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGPCAVATSVLHLLERRWPESRIVFLQ 
GHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATCGHLGARIQRPQCVMANGQEVSEALCDHLQKPLAG 
FEPCNIRDCPARWFTSVWSQCSVSCGEGYHSRQVTCKRTKANGTVQWSPRACAPKDRPLGRKPCFGHPCVQWEPG 
NRCPGRCMGRAVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSRKVDCIH 
TRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSCQEG (SEQ ID NO: 5) 

Table 9 shows a multiple sequence alignment of the NOV-1, NOV-2a, and NOV-2b 
polypeptides with a KIAA1233 protein (GenBank Accession No: BAA86547), which 
demonstrates the homology between the disclosed sequences according to the invention and a 
known member of the protein family. 



TABLE 9 
KIAA1233 

NOVl 

NOV2b 

NOV2a MASWTSPWWVLIGMVFMHSPLPQTTAEKSPGAYFLPEFALSPQGSFLEDTTGEQFLTYRY 

KIAA1233 

NOVl 

NOV2b : 

N0V2a DDQTSRNTRSDEDKDGNWDAWGDWSDCSRTCGGGASYSLRRCLTGRNCEGQNIRYKTCSN 

KIAA1233 

NOVl 

N0V2b 

NOV2a HDCPPDAEDFRAQQCSAYNDVQYQGHYYEWLPRYNDPAAPCALKCHAQGQNLWELAPKV 

KIAA1233 . 

NOVl 

N0V2b 

N0V2a LDGTRCNTDSLDMCISGICQAVGCDRQLGSNAKEDNCGVCAGDGSTCPvLVRGQSKSHVSP 

KIAA1233 T ^ 

NOVl 

NOV2b 

N0V2a EKREEKVIAVPLGSRSVRITVKGPAHLFIESKTLQGSKGEHSFNSPGVFWENTTVEFQR 

KIAA1233 

NOVl 

N0V2b 

N0V2a GSERQTFKIPGPLMADFIFKTRYTAAKDSVVQFFFYQPISHQWRQTDFFPCTVTCGGGYQ 

KIAA1233 

NOVl MPYDHFQPLP 

N0V2b MPYDHFQPLP 

N0V2a LNSAECVDIRLKRVVPDHYCHYYPENVKPKPKLKECSMDPCPSSDGFKEIMPYDHFQPLP 

KIAA1233 . . 

NOVl RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP 
N0V2b RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP 
NOV2a RWEHNPWTACSVSCGGGIQRRSFVCVEESMHGEILQVEEWKCMYAPKPKVMQTCNLFDCP 

KIAA1233 — • 

NOVl KWIAMEWSQCTVTCGRGLRYRWLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEK 
N0V2b KWIAMEWSQCTVTCGRGLRYRWLCINHRGEHVGGCNPQLKLHIKEECVIPIPCYKPKEK 
N0V2a KWIT^EWSQCTVTCGRGLRYRWLCINHRGEHVGGCNPQLKLKIKEECVIPIPCYKPKEK 

KIAA1233 

NOVl SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT 
NOV2b SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT 
NOV2a SPVEAKLPWLKQAQELEETRIATEEPTFIPEPWSACSTTCGPGVQVREVKCRVLLTFTQT 

KIAA1233 

NOVl ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC 
NOV2b ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC 
NOV2a ETELPEEECEGPKLPTERPCLLEACDESPASRELDIPLPEDSETTYDWEYAGFTPCTATC 



KIAA1233 AVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG 

NOVl VGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG 
N0V2b VGGHQEAIAVCLHIQTQQTVNDSL.CDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG 
NOV2a VGGHQEAIAVCLHIQTQQTVNDSLCDMVHRPPAMSQACNTEPCPPRWHVGSWGPCSATCG 
gQ **************************************************** 

KIAA1233 VGIQTRDVYCIiHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ 
NOVl VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ 
NOV2b VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ 
NOV2a VGIQTRDVYCLHPGETPAPPEECRDEKPHALQACNQFDCPPGWHIEEWQQCSRTCGGGTQ 

KIAA1233 NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ 
5 NOVl NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ 
N0V2b NRRVTCRQIiLTDGSFLNLSDELCQGPECASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ 
NOV2a NRRVTCRQLLTDGSFLNLSDELCQGPKASSHKSCARTDCPPHLAVGDWSKCSVSCGVGIQ 

10 KIAA1233 RRKQVCQRLAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILS 
NOVl RRKQVCQRLAAKGRRIPLSEMMCRDLPGLPLVRSCQMPECSKIKSEMKTKLGEQGPQILS 
NOV2b RRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILS 
NOV2a RRKQVCQRLAAKGRRIPLSEMMCRDLPGFPLVRSCQMPECSKIKSEMKTKLGEQGPQILS 


KIAA1233 VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG 
NOVl VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG 
NOV2b VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG 
NOV2a VQRVYIQTREEKRINLTIGSRAYLLPNTSVIIKCPVRRFQKSLIQWEKDGRCLQNSKRLG 

KIAA1233 ITKSGSLKIHGLAAPDIGVYRCIAGSAQETWLKLIGTDNRLIARPALREPMREYPGMDH 
NOVl ITKSGSLKIHGLi\APDIGVYRCIAGSAQETWLKLIGTDNRLIARPALREPMREYPGMDH 
NOV2b ITKSGSLKIHGLAAPDIGVYRCIAGSAQETWLKLIGTDNRLIARPALREPMREYPGMDH 
25 NOV2a ITKSGSLKIHGLAAPDIGVYRCIAGSAQETWLKLIGTDNRLIARPALREPMREYPGMDH 

KIAA1233 SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF 
NOVl SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF 
30 NOV2b SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF 
NOV2a SEANSLGVTWHKMRQMWNNKNDLYLDDDHISNQPFLRALLGHCSNSAGSTNSWELKNKQF 

KIAA1233 EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG 
35 NOVl EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG 
NOV2b EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG 
NOV2a EAAVKQGAYSMDTAQFDELIRNMSQLMETGEVSDDLASQLIYQLVAELAKAQPTHMQWRG 

40 KIAA1233 IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN 
NOVl IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN 
NOV2b IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN 
NOV2a IQEETPPAAQLRGETGSVSQSSHAKNSGKLTFKPKGPVLMRQSQPPSISFNKTINSRIGN 


KIAA1233 TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI 
NOVl TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI 
NOV2b TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI 
NOV2a TVYITKRTEVINILCDLITPSEATYTWTKDGTLLQPSVKIILDGTGKIQIQNPTRKEQGI 

50 ****************************************************** 

KIAA1233 YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVWGGIVET^LGANVTIR 
NOVl YECS VANHLGS DVES S S VL YAEAPVI LS VERN I T KPEHNHLS WVGGI VEAALG AN VT I R 
NOV2b YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVWGGIVEAALGANVTIR 
55 NOV2a YECSVANHLGSDVESSSVLYAEAPVILSVERNITKPEHNHLSVWGGIVEAALGT^VTIR 

KIAA1233 CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS 
NOVl CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS 
60 N0V2b CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGfCAVATS 
N0V2a CPVKGVPQPNITWLKRGGSLSGNVSLLFNGSLLLQNVSLENEGTYVCIATNALGKAVATS 
KIAA1233 VLHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC 
NOVl VLHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC 
N0V2b VTjHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC 
NOV2a VFHLLERRWPESRIVFLQGHKKYILQATNTRTNSNDPTGEPPPQEPFWEPGNWSHCSATC 

5 ********************************************** 

KIAA1233 GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG 
NOVl GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG 
N0V2b GHLGARIQRPQCVMANGQEVSEALCDHLQKPLAGFEPCNIRDCPARWFTSVWSQCSVSCG 

10 N0V2a ghlgariqrpqcvmangqevsealcdhlqkplagfepcnirdcparwftsvwsqcsvscg 

KIAA1233 egyhsrqvtckrtpcangtvqvvspracapkdrplgrkpcfghpcvqwepgnrcpgrcmgr 
NOVl egyhsrqvtckrtkangtvqvvspracapkdrplgrkpcfghpcvqwepgnrcpgrcmgr 
15 N0V2b egyhsrqvtckrtkangtvqwspracapkdrplgrkpcfghpcvqwepgnrcpgrcmgr 
NOV2a egyhsrqvtckrtkangtvqwspracapkdrplgrkpcfghpcvqwepgnrcpgrcmgr 

************************************************************ 

KIAA1233 avrmqqrhtacqhnssdsncddrkrptlrrnctsgacdvcwhtgpwkpctaacgrgfqsr 
20 NOVl avrmqqrhtacqhnssdsncddrkrptlrrnctsgacdvcwhtgpwkpctaacgrgfqsr 

N0V2b AVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR 
N0V2a AVRMQQRHTACQHNSSDSNCDDRKRPTLRRNCTSGACDVCWHTGPWKPCTAACGRGFQSR 
******** ****************************** ********************* 

25 KIAA1233 KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY 
NOVl KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY 
N0V2b KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLNLCSLDRY 
N0V2a KVDCIHTRSCKPVAKRHCVQKKKPISWRHCLGPSCDRDCTDTTHYCMFVKHLlSiLCSLDRY 
************************************************************ 


KIAA1233 KQRCCQSCQEG (SEQ ID NO: 28) 
NOV1 KQRCCQSCQEG (SEQ ID NO: 2) 
NOV2b KQRCCQSCQEG (SEQ ID NO: 5) 
NOV2a KQRCCQSCQEG (SEQ ID NO: 4) 

35 ********** 
Consensus key 

* - single, fully conserved residue 

: - conservation of strong groups 

. - conservation of weak groups 

(space) - no consensus 
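The "*" row of the consensus key can be illustrated with a short Python sketch. It marks only fully conserved columns; the strong-group and weak-group symbols are deliberately left out, and the function name `consensus_line` is hypothetical. The four example rows are a short excerpt of the Table 9 alignment around the one position where KIAA1233 and NOV-1 (L) differ from NOV-2b and NOV-2a (F).

```python
def consensus_line(rows):
    """Return a string with '*' under every fully conserved column.

    Simplification (an assumption of this sketch): columns that are not
    fully conserved are left blank, so the ':' and '.' symbols of the
    consensus key are not computed here.
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        residues = set(column)
        # A column is fully conserved when it holds one residue and no gap.
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Excerpt of the alignment rows (KIAA1233, NOV1, NOV2b, NOV2a).
rows = [
    "RDLPGLPLVRSCQ",  # KIAA1233
    "RDLPGLPLVRSCQ",  # NOV1
    "RDLPGFPLVRSCQ",  # NOV2b
    "RDLPGFPLVRSCQ",  # NOV2a
]
print(consensus_line(rows))
```

The blank in the printed line falls under the L/F column, which is how a single substitution shows up as a gap in the row of asterisks.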


Based on the relatedness of the disclosed NOV-2b to the disclosed NOV-1, the disclosed 
NOV-2a, and KIAA1233 sequences, which as noted are related to lacunin, thrombospondins, 
proteinases, semaphorins, ADAM-TS and properdin family members, the nucleic acids and 
proteins of the invention can have functions similar to those of proteins belonging to these families. 
Thus, the invention is implicated in the following diseases and processes and has therapeutic 
uses in these diseases and processes: (i) inflammation, (ii) cancer, (iii) neuronal development 
and axonal guidance, (iv) angiogenesis and vasculogenesis, in cancer as well as for ischemia, 
(v) tissue regeneration in vivo and in vitro, and (vi) other diseases and disorders. 

Functional roles attributed to this family of proteins include cell attachment, spreading, 
motility, and proliferation, cytoskeletal organization, wound healing, and angiogenesis. 
Moreover, these proteins are expressed in the nervous system during development and are 
thought to play roles in neuronal growth and patterning. In particular, the thrombospondin, 
METH-1 and ADAMTS families of proteins are potent inhibitors of angiogenesis. The 
ADAMTS proteins have also been implicated in cleavage of proteoglycans and the control of 
organ shape during development. In addition, the thrombospondins have been implicated in 
the activation of both transforming growth factor-beta (TGF-β) precursors and TGF-β in a 
variety of disease states. Furthermore, semaphorin proteins have shown expression in 
undifferentiated neuroepithelium, suggesting that these proteins are actors in axonal guidance. 

The novel nucleic acids of the invention encoding human proteins include the nucleic 
acids whose sequences are provided as NOV-1, NOV-2a, and NOV-2b, respectively, or 
fragments thereof. The invention also includes mutant or variant nucleic acids any of whose 
bases may be changed from the corresponding bases shown as NOV-1, NOV-2a, and NOV-2b, 
while still encoding a protein that maintains its human KIAA1233-like protein activities 
and physiological functions, or a fragment of such nucleic acids. The invention further 
includes nucleic acids whose sequences are complementary to those just described, including 
nucleic acid fragments that are complementary to any of the nucleic acids just described. The 
invention additionally includes nucleic acids or nucleic acid fragments, or complements 
thereto, whose structures include chemical modifications. Such modifications include, by way 
of non-limiting example, modified bases, and nucleic acids whose sugar phosphate backbones 
are modified or derivatized. These modifications are carried out at least in part to enhance the 
chemical stability of the modified nucleic acid, such that they may be used, for example, as 
anti-sense binding nucleic acids in therapeutic applications in a subject. 

The novel proteins of the invention include the human KIAA1233-like proteins whose 
sequences are provided as NOV-1, NOV-2a, and NOV-2b, respectively. The invention also 
includes a mutant or variant protein any of whose residues may be changed from the 
corresponding residues shown as NOV-1, NOV-2a, and NOV-2b, while still encoding a 
protein that maintains its human KIAA1233-like protein activities and physiological 
functions, or a functional fragment thereof. 

The invention further encompasses antibodies and antibody fragments, such as Fab or 
(Fab)2, that bind immunospecifically to any of the proteins of the invention. 

The expression pattern and protein similarity information for the invention suggest 
that NOV-1, NOV-2a and NOV-2b may function as human KIAA1233-like proteins. 
Therefore, the nucleic acids and proteins of the invention are useful in potential therapeutic 
applications implicated in, for example but not limited to, (i) inflammation, (ii) cancer, (iii) 
neuronal development and axonal guidance, (iv) angiogenesis and vasculogenesis, in cancer 
as well as for ischemia, (v) tissue regeneration in vivo and in vitro, and (vi) other diseases 
and disorders. The homology to antigenic secreted and membrane proteins also suggests that 
antibodies directed against the novel genes may be useful in treatment and prevention of (i) 
inflammation, (ii) cancer, (iii) neuronal development and axonal guidance, (iv) angiogenesis 
and vasculogenesis, in cancer as well as for ischemia, (v) tissue regeneration in vivo and 
in vitro, and (vi) other diseases and disorders. 

Potential therapeutic uses for the invention(s) are, for example but not limited to, the 
following: (i) protein therapeutic, (ii) small molecule drug target, (iii) antibody target 
(therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic and/or prognostic 
marker, (v) gene therapy (gene delivery/gene ablation), (vi) research tools, and (vii) tissue 
regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing 
these tissues and cell types derived from these tissues). 

NOV-3: A Novel STE20 Protein Kinase 

The NOV-3 sequences (NOV-3a, NOV-3b, NOV-3c, and NOV-3d) according to the 
invention are splice variants related to STE20 protein kinases. The differences between the 
four sequences relate to the four ways of independently combining two deletions arising from 
two splice variants in the mRNAs. 
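Two independent deletions, each either present or absent, give 2 x 2 = 4 combinations. The Python sketch below enumerates them for a purely hypothetical transcript; the segment names (`exonA`, `del1`, and so on) and the function `enumerate_variants` are illustrative assumptions, and no mapping of a particular combination to NOV-3a, NOV-3b, NOV-3c, or NOV-3d is implied.

```python
from itertools import product


def enumerate_variants(segments, optional):
    """List every transcript obtained by keeping or deleting each optional segment."""
    optional_in_order = [s for s in segments if s in optional]
    variants = []
    # Each optional segment is independently kept (True) or deleted (False).
    for keep_flags in product([True, False], repeat=len(optional_in_order)):
        dropped = {s for s, keep in zip(optional_in_order, keep_flags) if not keep}
        variants.append([s for s in segments if s not in dropped])
    return variants


# Hypothetical transcript layout: two deletable regions between constant exons.
segments = ["exonA", "del1", "exonB", "del2", "exonC"]
variants = enumerate_variants(segments, optional={"del1", "del2"})
for variant in variants:
    print("-".join(variant))
```

The first printed variant retains both regions and the last lacks both, with the two single-deletion forms in between.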

Splice variants are sequences that occur naturally within the cells and tissues of 
individuals. The physiological activity of splice variant products and the original protein, from 
which they are varied, may be the same (although perhaps at a different level), opposite, or 
completely different and unrelated. In addition, variants may have no activity at all. When a 
variant and the original sequence have the same or opposite activity, they may differ in various 
properties not directly connected to biological activity, such as stability, clearance rate, tissue 
and cellular localization, temporal pattern of expression, up or down regulation mechanisms, 
and responses to agonists or antagonists. The presence or level of specific splice variants may 
be the cause of, and/or indicative of, a disease, disorder, pathological or normal condition. 

Because a drug may be effective against one variant but not another, or may cause side 
effects because it targets all splice variants, an effective drug needs to target the particular 
splice variant. Because soluble variants with therapeutic or disease-related functions may be 
naturally occurring in specific tissues, they may be optimal candidates for drug targets or 
protein therapeutics. Variants may have no activity at all and may thus serve as dominant 
negative natural inhibitors. Thus, splice variants are useful in generating new drug targets, protein 
therapeutics and markers for diagnostics. 

NOV-3 sequences according to the invention encode polypeptides related to STE20 
protein kinases, whose subgroups include GCK, SLK, and PSK proteins. Therefore, the 
nucleic acids and proteins of the invention can have functions similar to those of proteins belonging to 
these subgroups. 

Functional roles attributed to STE20 proteins include cytoskeletal organization, 
apoptosis, and signal transduction pathways. Thus, the NOV-3 nucleic acids and 
polypeptides, antibodies and related compounds according to the invention will be useful in 
therapeutic and diagnostic applications in, e.g., metabolic and 
endocrine disorders, cancer, bone disorders, and tissue/cell growth regulation disorders. 

NOV-3 sequences were initially identified by searching CuraGen's Human SeqCalling 
database for DNA sequences that translate into proteins with similarity to the STE20 protein 
kinase family. The SeqCalling assembly for NOV-3 was analyzed further to identify open 
reading frame(s) encoding novel full-length protein(s) and novel splice variants of these 
genes. This was done by extending the SeqCalling assembly using additional SeqCalling 
assemblies, publicly available EST sequences and public genomic sequence. Public ESTs and 
additional CuraGen SeqCalling assemblies were identified by the CuraTools program 
SeqExtend. They were included in the DNA sequence extension for SeqCalling assembly 
18552586 when extended overlaps were found. 
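The general idea of extending an assembly through overlapping sequences can be sketched in Python as follows. This is only an illustration of suffix-prefix overlap extension under simplifying assumptions (exact matches, extension in the 3' direction only, a hypothetical `min_overlap` threshold); it is not the SeqExtend program, whose algorithm is not described here. The example fragments are taken from the first bases of the NOV-3a sequence of Table 10.

```python
def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def extend_assembly(seed, candidates, min_overlap=8):
    """Greedily extend `seed` at its 3' end with exactly overlapping candidates."""
    assembly = seed
    remaining = list(candidates)
    extended = True
    while extended:
        extended = False
        for read in remaining:
            size = overlap_length(assembly, read, min_overlap)
            # Accept only reads that overlap and add at least one new base.
            if size and size < len(read):
                assembly += read[size:]
                remaining.remove(read)
                extended = True
                break  # restart the scan against the longer assembly
    return assembly


seed = "ATGGGCGACCCAGCC"
candidates = ["CCGCCCGCAGCCTGG", "CCCAGCCCCCGCCCGC"]
print(extend_assembly(seed, candidates, min_overlap=7))
```

A practical extension step would also have to tolerate sequencing errors, consider the reverse strand, and extend in both directions, none of which this sketch attempts.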

SeqCalling is a differential expression and sequencing procedure that normalizes 
mRNA species in a sample, and is disclosed in U.S. Ser. No. 09/417,386 filed October 13, 
1999, which is incorporated herein by reference in its entirety. 

A genomic clone of NOV-3 was analyzed by Genscan™ and Grail™ to identify exons 
and putative coding sequences/open reading frames. The NOV-3 clone was also analyzed by 
TblastN, BlastX and other homology programs to identify regions translating to proteins with 
similarity to the original protein/protein family of interest. 

The results of these analyses were integrated and manually corrected for apparent 
inconsistencies, thereby obtaining the sequences encoding the full-length proteins. When 
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated 
to derive the full-length sequence. The full-length DNA sequences as well as their splice 
forms, and the full-length protein sequences that they encode, are disclosed herein. 
NOV-3 was mapped to chromosome 17. 

Based on the CuraGen SeqCalling database information, NOV-3 is expressed in 
heart tissue. Moreover, based on the expression of STE20 family members, the following 
tissues are also likely to express the invention: brain (especially hippocampus and cerebral 
cortex), prostate, and blood hematopoietic cell lines. The patterns of expression for this gene 
and its family members, combined with its similarity to the STE20 kinase family of genes, 
suggest that the NOV-3 proteins function as kinases in the tissues of expression. Thus, NOV-3 
is implicated in disorders involving these tissues. Some of these disorders include: 
cardiovascular disorders, diabetes, leukemia/lymphoma, cancer, musculoskeletal disorders, 
muscular generation, reproductive health, metabolic and endocrine disorders, gastrointestinal 
disorders, immune and autoimmune disorders, respiratory disorders, bone disorders, and 
tissue/cell growth regulation disorders. 

Additional utilities for NOV-3 nucleic acids and polypeptides according to the 
invention are also disclosed herein. 



NOV-3a 

A NOV-3a sequence according to the invention is a nucleic acid sequence encoding a 
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3a nucleic acid and 
its encoded polypeptide include the sequences shown in Table 10. The disclosed nucleic acid 
(SEQ ID NO: 6) is 3999 nucleotides in length and contains an open reading frame (ORF) that 
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at 
nucleotides 3997-3999. The start and stop codons are shown in bold font. The respective 
ORF encodes a 1332 amino acid polypeptide (SEQ ID NO: 7). 
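The ORF arithmetic can be checked with a short Python sketch: a 3999-nucleotide sequence that begins with ATG and ends with a three-base stop codon has 3996 coding bases, i.e. 1332 codons, which places the stop codon at nucleotides 3997-3999. The helper names `translate` and `orf_summary` are hypothetical, and the codon table is the standard genetic code, which is assumed to apply here. The test fragment is the first 33 bases of the Table 10 nucleic acid sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def orf_summary(total_length, stop_codon_length=3):
    """Codon count and 1-based stop codon coordinates for an ORF spanning the sequence."""
    coding = total_length - stop_codon_length
    if coding % 3:
        raise ValueError("coding region length is not a multiple of three")
    return {"codons": coding // 3, "stop_start": coding + 1, "stop_end": total_length}


print(translate("ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGAC"))
print(orf_summary(3999))
```

The translated fragment agrees with the first residues of the Table 10 polypeptide, and the codon count agrees with the stated 1332 amino acids.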


TABLE 10 

ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCTGCTGGGATCTTTGAGCT 
TGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAAGGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCA 
AGGTCATGGATGTCACGGAGGACGAGGAGG7\AGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGC 

25 AJ^.CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCTCTGGCTGGTGATGGAGTT 
CTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAAAGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCT 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTG 
CTGACAGAGAATGCTGAGGTCi\AGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGCAGACGGAACAC 
TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGA 

30 GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCATTGACTTCATTGA 
CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCA 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAGAA 
TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATCATGAACGTGCCTGG 

35 AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCTTTAAAACAGCAGCAGC 
AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCATAGAGGAG 
CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTGCAGGAGAAGGAGCAGCA 
GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC 
ACAGGCTAGAGGAGGAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGGCCCTGCTGCTGGAATAC 
AAGCGGAAGCAGCTGGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGCAGCAGGAGCATGCCTACCTCAAGTC 
CCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCAGAAACAGCAGCAGCAGCAGCTCCTGCCTGGGGACAGGAAGCCCCTGT 
ACCATTATGGTCGGGGCATGAATCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGAAGAGAGAACAAGGATGAACAAG 
CAGCAGAACTCTCCCTTGGCCAAGAGCAAGCCAGGCAGCACGGGGCCTGAGCCCCCCATCCCCCAGGCCTCCCCAGGGCC 
5 CCCAGGACCCCTTTCCCAGACTCCTCCTATGCAGAGGCCGGTGGAGCCCCAGGAGGGACCGCACAAGAGCCTGGTGGCAC 
ACCGGGTCCCACTGAAGCCATATGCAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCT 
GCCTTCCCAGCCTCCCATGACCCCGACCCTGCCATCCCCGCACCCACTGCCACGCCCAGTGCCCGAGGAGCTGTCATCCG 
CCAGAATTCAGACCCCACCTCTGAAGGACCTGGCCCCAGCCCG7UVTCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCC 
CACCCAAGGTGCCTCAGAGGACCTCATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGGCCAGCCCAG 
10 GCAGTCCGTGCCAGTAACCCCGACCTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCCAGCCTCTCA 
CGGGCACCTCCCCCAGGCTGGCTCACTGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGGACAGCTCCCCTGTGCTCT 
CCCCTGGGAATAAAGCCAAGCCCGACGACCACCGCTCACGGCCAGGCCGGCCCGCAAGCTATAAGCGAGCAATTGGTGAG 
GACTTTGTGTTGCTGAAAGAGCGGACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCATGGACTACTCGTCGTCCAG 
CGAGGAGGTGGAAAGCAGTGAGGACGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCCTGGGG 

15 gccgcagcgatggggatacagacagcgtcagcaccatggtggtccacgacgtcgaggagatcaccgggacccagccccca 
tacgggggcggcaccatggtggtccagcgcacccctgaagaggagcggaacctgctgcatgctgacagcaatgg6tacac 
aaacctgcctgacgtggtccagcccagccactcacccaccgagaacagcaaaggccaaagcccaccctcgaaggatggga 
gtggtgactaccagtctcgtgggctggtaaaggcccctggcaagagctcgttcacgatgtttgtggatctagggatctac 
cagcctggaggcagtggggacagcatccccatcacagccctagtgggtggagagggcactcggctcgaccagctgcagta 

20 cgacgtgaggaagggttctgtggtcaacgtgaatcccaccaacacccgggcccacagtgagacccctgagatccggaagt 
acaagaagcgattcaactccgagatcctctgtgcagccctttggggggtcaacctgctggtgggcacggagaacgggctg 
atgttgctggaccgaagtgggcagggcaaggtgtatggactcattgggcggcgacgcttccagcagatggatgtgctgga 
ggggctcaacctgctcatcaccatctcagggaaaaggaacaaactgcgggtgtattacctgtcctggctccggaacaaga 
ttctgcacaatgacccagaagtggagaagaagcagggctggaccaccgtgggggacatggagggctgcgggcactaccgt 

25 gttgtgaaatacgagcggattaagttcctggtcatcgccctcaagagctccgtggaggtgtatgcctgggcccccaaacc 
ctaccacaaattcatggccttcaagtcctttgccgacctcccccaccgccctctgctggtcgacctgacagtagaggagg 
ggcagcggctcaaggtcatctatggctccagtgctggcttccatgctgtggatgtcgactcggggaacagctatgacatc 
tacatccctgtgcacatccagagccagatcacgccccatgccatcatcttcctccccaacaccgacggcatggagatgct 
gctgtgctacgaggacgagggtgtctacgtcaacacgtacgggcgcatcattaaggatgtggtgctgcagtggggggaga 

30 tgcctacttctgtggcctacatctgctccaaccagataatgggctggggtgagaaagccattgagatccgctctgtggag 
acgggccacctcgacggggtcttcatgcacaaacgagctcagaggctcaagttcctgtgtgagcggaatgacaaggtgtt 
ttttgcctcagtccgctctgggggcagcagccaagtttacttcatgactctgaaccgtaactgcatcatgj^ctggtga 
(SEQ ID NO: 6) 

35 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQIJ^IKVMDVTEDEEEEIKQEINMLKKYSHHR 
NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVL 
LTENAEVKLVDFGVSAQLDRTVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 
ALFLlPRNPPPRIiKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRIQLKDHIDRSRKKRGEKEETE 
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEE 

40 QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY 
KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNK 
QQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLA 
AFPASHDPDPAIPAPTATPSTUIGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGE 

45 DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPP 
YGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGL 
MLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYR 
WKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVE 
5 TGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW (SEQ ID NO: 7) 

The disclosed NOV-3a nucleic acid sequence has homology (73% identity) to a mouse 
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table 
11. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the 
"Expect" value, the probability of this alignment occurring by chance alone is 4.3e-298, which 
is an extremely low probability score. Moreover, the disclosed, encoded amino acid sequence 
has 1095 of 1332 amino acid residues (82%) identical to a human NIK-related protein 
(GenBank Accession No: BAA90753), as shown in Table 12. As indicated by the "Expect" 
value, the probability of this alignment occurring by chance alone is reported as 0.0, the lowest 
probability score. 
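The identity percentages and the relation between an Expect value and a chance probability can be reproduced with a few lines of Python. Two assumptions are made in this sketch: the percentage is truncated to a whole number (1224/1657 is about 73.9%, so the reported 73% is consistent with truncation rather than rounding), and the probability of at least one chance hit is taken as P = 1 - exp(-E), the usual relation for BLAST-style statistics. The function names are hypothetical.

```python
import math


def percent_identity(identical, aligned_length):
    """Identity as a whole-number percentage, truncated rather than rounded."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return math.floor(100.0 * identical / aligned_length)


def chance_probability(expect):
    """Probability of at least one chance alignment, P = 1 - exp(-E)."""
    # expm1 keeps precision for very small Expect values.
    return -math.expm1(-expect)


print(percent_identity(1224, 1657))   # nucleotide alignment counts (Table 11)
print(percent_identity(1095, 1332))   # amino acid alignment counts (Table 12)
print(chance_probability(4.3e-298))
```

For Expect values this small, P and E are numerically indistinguishable, which is why the text can quote the Expect value directly as a probability.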
TABLE 11 

Score = 3892 (584.0 bits), Expect = 4.3e-298, Sum P(2) = 4.3e-298 

Identities = 1224/1657 (73%), Positives = 1224/1657 (73%), Strand = Plus / Plus 



N0V3a : 

NIK : 

N0V3a : 
122 

NIK : 
122 

N0V3a: 
182 

NIK : 
179 

NOV3a : 
242 

NIK : 
239 

NOV3a : 
302 

NIK : 
299 



4 GGCGACCChGCC-CCCGCCCGCAGCCTGGACGAChTCGhCCTGTCCGCCCTGCGGGACCC 62 

GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 
3 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 
63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGTTGGAAATGGCACCTATGGACAAGTCTATAA 



123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 



123 



GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 
GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTCACCGAGGA 



183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 



180 



GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 
TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 



243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 

AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT 
24 0 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 
303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 
NOV3a: 
362 

NIK : 
359 

NOV3a: 
422 

NIK : 
419 

NOV3a : 
482 

NIK : 
479 

NOV3a: 
542 

NIK : 
539 

N0V3a ; 
601 

NIK : 
598 



N0V3a : 
660 

NIK : 
657 

N0V3a: 
720 

NIK : 
717 

N0V3a : 
780 

NIK : 
777 

N0V3a: 
838 

NIK : 
835 

NOV3a: 
898 

NIK : 
895 

N0V3a: 
958 



CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA 
300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 



363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 

AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T 
360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 



423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 

GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT 
420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT 



4 83 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 

GAG GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G'-AC 
4 80 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 



54 3 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 

GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG 
540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 



602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 

CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG 
599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 



661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 

ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA 
658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 



721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 

GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG 
718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 



781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 

AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A 
778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 



839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 

C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A 
836 CAGAGCAACTTTT AAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 



899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 



NIK : 
955 

N0V3a : 
1014 

NIK : 
1014 

NOV3a : 
1074 

NIK : 
1074 

NOV3a : 
1132 

NIK : 
1132 

NOV3a: 
1192 

NIK : 
1192 

NOVBa : 
1251 

NIK : 
1252 

NOV3a: 
1307 

NIK : 
1310 

N0V3a : 
1365 

NIK : 
1366 

NOV3a : 
1423 

NIK : 
1423 



TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G 
896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 



959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 

A TA GAGTACAGCGG AGCGAGGAGGA GA GA A G C TG AG AGGA GGAGAG 
956 AGT ACG AGTACAGCGGGAGCGAGG AGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 



1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 

CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT 
1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 



1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 

CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC 

1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 



1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 

AGCAGC CG GA C GAGG AAA CA CTGCTG AG GGCAG GCG A 
1133 AGCAGCTCCGGGAGC AGGAGGAGT ATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 



1193 T AGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G 
1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 



1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG-G — CTGGAGGACATGCAGGC-TCT 

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT 

1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 



1308 GCGGCGGGA-— GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG 
1311 -CGA- GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG- AGGAG- ACGGGCAGAAGAGG 



1366 CTA-GAGGAG-GAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGG 

A GAGGAG G G AG G AC GAG T C TCAG C GC GCT AGGA AG 

1367 AGAAGAGGAGAGTGGAGAGGGAACAGGAG-TACATCAGG — CGGCAGCTAGAGGAGGAGC 






NOV3a : 
1479 

NIK : 
1482 

NOV3a: 
1538 



1424 CCCTGCTGCTGGA-ATACA — AGCGGAAGCAGCTGGAGGAGCAGCGGCA-GTCAGAACGT 

C GC CTGGA AT C AGC G AGC GCT AGGAGCAG G CA GT A C 
1424 AGCGGCACCTGGAGATCCTGC AGCAGCAGCTGCTCCAGGAGCAG-GCCATGTTACTGCAC 

1480 CTCCAGAGGCAGCTGCA-GCAGGAGCATGCCTACCTCAAGTCCCTGCAGCAGCAGCAACA 
CCA AGG GC GCA GCA AGCA GC CC C G CCC GCAGCAGCAG A CA 
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NIK : 1483 GACCACAGGAGGCCGCACGCAC-AGCA-GCAG-CCGCC-GCCCCCGCAGCAGCAGGA-CA 
1537 

N0V3a : 1539 GCAGCAG — C-AGCTT-CA-GAAACAGCAGCAGCAGCAGCTCC-TG-CC-TGGGGACAGG 
5 1590 

G AGCA C AGCTT CA G CAG AGC AGC C C TG CC TG GACAG 
NIK : 1538 GGAGCAAACCGAGCTTTCATGCTCCAG-AGCCCAAGCCTCACTATGACCCTGCTGACAG- 
1595 

10 NOV3a : 1591 AAGCCCCTGTACCATTATGGTCGGGGCATGAATCCCGCT-GA-CAAAC-CAGCCTGGGCC 
1647 

AGC C G A TGGTC C G ATC C C GA C7^ C CC G C 
NIK : 1596 -AGCTCGGGAGGTACAGTGGTCCCACCTGGCATCTCTCAAGAACAATGTCTCCCCTGTCT 






1654 

N0V3a: 1648 CGAGA 1652 (SEQ ID NO: 62) 
CGAGA 

NIK ; 1655 CGAGA 1659 (SEQ ID NO: 29) 



TABLE 12 



Score = 2104 bits (5451), Expect = 0.0
Identities = 1095/1332 (82%), Positives = 1095/1332 (82%), Gaps = 37/1332 (2%)

N0V3a : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60 
30 . MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVT 

NIK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60 

NOV3a : 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 
IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
35 NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

N0V3a: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

KGNTUiKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 
NIK : 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

N0V3a : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 24 0 

TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 
NIK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 

45 N0V3a: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 
NIK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

NOV3a: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 
50 QLKDHI PSSIMNVPGESTLRREFLRLQQ 

NIK : 301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

N0V3a : 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 
ENKSNSEALK RDPEAHIKHLLH 
55 NIK ; 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

N0V3a: 421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHRXXXXXXXXXXXXXXXXXXXXXXXXY 480 

DMQAL Y 
NIK : 421 LQEKEQQRRLEDMQAL RREEERRQAEREQEY 451 



N0V3a: 481 KRKXXXXXXXXXXXXXXXXXXHAYLKSXXXXXXXXXXXXXXXXXXXPGDRKPLYHYGRGM 540 
KRK HAYLKS PGDRKPLYHYGRGM 
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KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 

NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTXXXXXXXXXXXXXXXXXXXXXXMQRP 
NPADKPAWAREVEERTRMNKQQNSPLAKSKPGST MQRP 
NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 

VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 
VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASH • 
VEPQEGPHKSLVMRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 

XRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPECVPQRTSSIATALNTSGAGGSRPAQ 

AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 

P DDHRS RPGRPAS YKPAI GE DF\n:XKERTLDEAPRP PKKJ^JylDYXXXXXXXXXXXXXXXXX 
PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDY 

PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 

XXXXXXXXRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 

RDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTbW^HDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 

ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 

EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 



N0V3a : 1081 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
1140 

KLRYYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
NIK : 104 4 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 
1103 

NOVSa : 1141 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
1200 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
NIK : 1104 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
1163 

N0V3a : 1201 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
1260 

YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
NIK : 1164 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
1223 

N0V3a : 1261 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
1320 

ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
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NIK : 


452 


N0V3a : 


541 


NIK : 


512 


N0V3a : 


601 


NIK : 


572 


N0V3a : 


661 


NIK : 


632 


N0V3a : 


721 


NIK : 


692 


N0V3a: 


781 


NIK : 


752 


N0V3a : 


841 


NIK : 


804 


N0V3a: 


901 


NIK : 


864 


iNvj V oa . 


^ O JL 


1020 




N J.i\ : 




NOVSa: 


1021 


1080 




NIK : 


984 


1043 
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NIK : 1224 ICSNQIMGWGEECAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 
1283 

N0V3a: 1321 FMTLNRNCIMNW 1332 (SEQ ID NO: 63) 
5 FMTLNRNCIMNW 

NIK : 1284 FMTLNRNCIMNW 1295 (SEQ ID NO: 30) 

Based on its relatedness to known members of the STE20 family of protein kinases,
NOV-3a provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.

NOV-3b

A NOV-3b sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3b nucleic acid and
its encoded polypeptide include the sequences shown in Table 13. The disclosed nucleic acid
(SEQ ID NO: 8) is 3912 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3910-3912. The start and stop codons are shown in bold font. The respective
ORF encodes a 1303 amino acid polypeptide (SEQ ID NO: 9).
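
The stated coordinates can be cross-checked with simple arithmetic: an ORF spanning nucleotides 1-3912 holds 3912 / 3 = 1304 codons, and dropping the terminal TGA stop codon leaves 1303 residues, in agreement with SEQ ID NO: 9. The following is a minimal Python sketch of that check, together with a translation of the first ten codons printed in Table 13. It assumes the standard genetic code; the function names `translate` and `orf_protein_length` are illustrative only and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def orf_protein_length(start: int, stop_end: int) -> int:
    """Residues encoded by an ORF, given the 1-based first base of the
    start codon and the 1-based last base of the stop codon."""
    span = stop_end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


# First 30 nucleotides of SEQ ID NO: 8 as printed in Table 13.
first_codons = "ATGGGCGACCCAGCCCCCGCCCGCAGCCTG"
print(translate(first_codons))       # MGDPAPARSL
print(orf_protein_length(1, 3912))   # 1303
```

The translated prefix matches the first ten residues of the polypeptide listed in Table 13.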

TABLE 13

ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCTGCTGGGATCTTTGAGCT 
TGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAAGGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCA 
AGGTCATGGATGTCACGGAGGACGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGC 
AACATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCTCTGGCTGGTGATGGAGTT 

CTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAAAGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCT
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTG 
CTGACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGCAGACGGAACAC 
TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGA 
GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 

GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCATTGACTTCATTGA

CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCA 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAGAA 
TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATCATGAACGTGCCTGG 
AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCTTTAAAACAGCAGCAGC 

AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCATAGAGGAG
CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTGCAGGAGAAGGAGCAGCA
GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC
ACAGGCTAGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGCAGCAGGAGCATGCCTACCTCAAGTCCCTGCAG 
CAGCAGCAACAGCAGCAGCAGCTTCAGAAACAGCAGCAGCAGCAGCTCCTGCCTGGGGACAGGAAGCCCCTGTACCATTA 
TGGTCGGGGCATGAATCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGAAGAGAGAACAAGGATGAACAAGCAGCAGA 
ACTCTCCCTTGGCCAAGAGCAAGCCAGGCAGCACGGGGCCTGAGCCCCCCATCCCCCAGGCCTCCCCAGGGCCCCCAGGA
CCCCTTTCCCAGACTCCTCCTATGCAGAGGCCGGTGGAGCCCCAGGAGGGACCGCACAAGAGCCTGGTGGCACACCGGGT
CCCACTGAAGCCATATGCAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCC
CAGCCTCCCATGACCCCGACCCTGCCATCCCCGCACCCACTGCCACGCCCAGTGCCCGAGGAGCTGTCATCCGCCAGAAT 
TCAGACCCCACCTCTGAAGGACCTGGCCCCAGCCCGAATCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCCCACCCAA 

GGTGCCTCAGAGGACCTCATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGGCCAGCCCAGGCAGTCC
GTGCCAGTAACCCCGACCTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCCAGCCTCTCACGGGCAC 
CTCCCCCAGGCTGGCTCACTGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGGACAGCTCCCCTGTGCTCTCCCCTGG 
GAATAAAGCCAAGCCCGACGACCACCGCTCACGGCCAGGCCGGCCCGCAAGCTATAAGCGAGCAATTGGTGAGGACTTTG 
TGTTGCTGAAAGAGCGGACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCATGGACTACTCGTCGTCCAGCGAGGAG 

GTGGAAAGCAGTGAGGACGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCCTGGGGGCCGCAG
CGATGGGGATACAGACAGCGTCAGCACCATGGTGGTCCACGACGTCGAGGAGATCACCGGGACCCAGCCCCCATACGGGG
GCGGCACCATGGTGGTCCAGCGCACCCCTGAAGAGGAGCGGAACCTGCTGCATGCTGACAGCAATGGGTACACAAACCTG 
CCTGACGTGGTCCAGCCCAGCCACTCACCCACCGAGAACAGCAAAGGCCAAAGCCCACCCTCGAAGGATGGGAGTGGTGA 
CTACCAGTCTCGTGGGCTGGTAAAGGCCCCTGGCAAGAGCTCGTTCACGATGTTTGTGGATCTAGGGATCTACCAGCCTG 

GAGGCAGTGGGGACAGCATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTCGGCTCGACCAGCTGCAGTACGACGTG
AGGAAGGGTTCTGTGGTCAACGTGAATCCCACCAACACCCGGGCCCACAGTGAGACCCCTGAGATCCGGAAGTACAAGAA
GCGATTCAACTCCGAGATCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGCACGGAGAACGGGCTGATGTTGC 
TGGACCGAAGTGGGCAGGGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCAGCAGATGGATGTGCTGGAGGGGCTC 
AACCTGCTCATCACCATCTCAGGGAAAAGGAACAAACTGCGGGTGTATTACCTGTCCTGGCTCCGGAACAAGATTCTGCA 

CAATGACCCAGAAGTGGAGAAGAAGCAGGGCTGGACCACCGTGGGGGACATGGAGGGCTGCGGGCACTACCGTGTTGTGA
AATACGAGCGGATTAAGTTCCTGGTCATCGCCCTCAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTACCAC 
AAATTCATGGCCTTCAAGTCCTTTGCCGACCTCCCCCACCGCCCTCTGCTGGTCGACCTGACAGTAGAGGAGGGGCAGCG 
GCTCAAGGTCATCTATGGCTCCAGTGCTGGCTTCCATGCTGTGGATGTCGACTCGGGGAACAGCTATGACATCTACATCC 
CTGTGCACATCCAGAGCCAGATCACGCCCCATGCCATCATCTTCCTCCCCAACACCGACGGCATGGAGATGCTGCTGTGC 

TACGAGGACGAGGGTGTCTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGCTGCAGTGGGGGGAGATGCCTAC
TTCTGTGGCCTACATCTGCTCCAACCAGATAATGGGCTGGGGTGAGAAAGCCATTGAGATCCGCTCTGTGGAGACGGGCC 
ACCTCGACGGGGTCTTCATGCACAAACGAGCTCAGAGGCTCAAGTTCCTGTGTGAGCGGAATGACAAGGTGTTTTTTGCC 

TCAGTCCGCTCTGGGGGCAGCAGCCAAGTTTACTTCATGACTCTGAACCGTAACTGCATCATGAACTGGTGA (SEQ ID NO: 8)



MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTEDEEEEIKQEINMLKKYSHHR
NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVL
LTENAEVKLVDFGVSAQLDRTVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRIQLKDHIDRSRKKRGEKEETE
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEE
QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEQRQSERLQRQLQQEHAYLKSLQ
QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPG
PLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQN
SDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH
LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEE
VESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNL
PDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV
RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGL
NLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYH
KFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC
YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFA
SVRSGGSSQVYFMTLNRNCIMNW (SEQ ID NO: 9)

The disclosed NOV-3b nucleic acid sequence has homology (75% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
14. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 3.3e-295, which
is an extremely low probability. Moreover, the disclosed, encoded amino acid sequence
has 1093 of 1303 amino acid residues (83%) identical to a human NIK-related protein
(GenBank Accession No: BAA90753), as shown in Table 15. As indicated by the "Expect"
value, the probability of this alignment occurring by chance alone is 0.0, the lowest probability
score.
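
The identity figures quoted above follow directly from the alignment counts reported in Tables 14 and 15. As a rough illustration, the Python sketch below recomputes them. It assumes that the reported percentages are truncated rather than rounded, which is consistent with 1128/1488 being reported as 75%; the helper names are hypothetical and are not part of the disclosure.

```python
def percent_identity(identical: int, aligned: int) -> int:
    """Whole-number percent identity, truncated (assumed reporting convention)."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return identical * 100 // aligned


def count_identities(query: str, subject: str) -> tuple[int, int]:
    """Count identical columns in two equal-length aligned rows ('-' marks a gap)."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    identical = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return identical, len(query)


print(percent_identity(1128, 1488))  # 75 (nucleic acid alignment, Table 14)
print(percent_identity(1093, 1303))  # 83 (amino acid alignment, Table 15)

# Final block of Table 14: NOV3b 1477-1485 against NIK 1483-1490.
print(count_identities("CAGCAGCAG", "GACCA-CAG"))  # (6, 9)
```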
TABLE 14

Score = 3828 (574.4 bits), Expect = 3.3e-295, Sum P(2) = 3.3e-295
Identities = 1128/1488 (75%), Positives = 1128/1488 (75%), Strand = Plus / Plus



NOVSb: 

NIK : 

NOVSb : 
122 

NIK : 
122 

NOVSb : 
182 

NIK : 
179 



4 GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62 
GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 

5 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 
63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGTTGGAAATGGCACCTATGGACAAGTCTATAA 



123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 

GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 
123 GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTCACCGAGGA 



40 



45 
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NOVSb : 
242 

NIK : 
239 

NOVSb : 
302 

NIK : 
299 



183 CGAGGAGGAAGAGATCATVACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCT^ 

GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 
180 TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 



243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 

AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT 
24 0 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 
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NOV3b : 303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 
362 

CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA 
NIK : 300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 
359 

N0V3b : 363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 
422 

AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T 
NIK : 360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 
419 

NOV3b : 423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 
482 

GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT 
NIK : 420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCT^GGGCCAAAATGTGCTGCT 
479 

NO V3b : 483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 
542 

GAG GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G-'AC 
NIK : 480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 
539 

NOV3b : 543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 
601 

GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG 
NIK : 540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 
598 



N0V3b : 602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 
660 

CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG 
NIK : 599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 
657 

N0V3b : 661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
720 

ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA 
NIK : 658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 
717 

N0V3b : 721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 
780 

GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG 
NIK : 718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 
777 

N0V3b : 781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 
838 

AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A - 
NIK : 778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 
835 

NOV 3b : 839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 
898 

C GAGCA CT T .AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A 
NIK : 836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 
895 

NOV3b : 899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 
958 
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NIK 
955 



TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G 
896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 



10 



15 



N0V3b : 
1014 

NIK : 
1014 

N0V3b: 
1074 

NIK : 
1074 



959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 

A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG 
956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 



1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 

CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT 
1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 
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NOV3b : 
1132 

NIK : 
1132 



1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 

CAGCAGGA AA AAG AGO TO GAGGCT T AG CAGCAGC CTGCAG AGC 

1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 
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N0V3b: 
1192 

NIK : 
1192 



1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 

AGCAGC CG GA C GAGG A A A CA CTGCTG AG GGCAG GCG A 
1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 



30 
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NOV3b : 
1251 

NIK : 
1252 

NOV3b: 
1307 

NIK : 
1310 

N0V3b: 
1365 

NIK : 
1366 



1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 

T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G 
1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 



1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG-G — CTGGAGGACATGCAGGC-TCT 

C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT 

1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 



1308 GCGGCGGGA~GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC ACAGG 

CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG 
1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG- ACGGGCAGAAGAGG 



50 



N0V3b : 
1418 

NIK : 
1424 



1366 CTA-GAGGAGCAGCGGC-AGT CAGAACGT-CTCCAGA-GGCAGCTGCAGCAGGAGCA 

A GAGGAG AG GG AG CAG A GT C CAG GGCAGCT AG AGGAGCA 

1367 AGAAGAGGAG- AGTGGAGAGGGAACAGGA-GTACATCAGGCGGCAGCTAGAGGAGGAGCA 
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N0V3b: 
1476 

NIK : 
1482 



1419 T-GCCTACCTCAAGTCCCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCA-GAAACAGCAG 

G C ACCT AG CCTGCAGCAGCAGC C CAG AGCAG CA G AC GCA 
1425 GCGGC-ACCTGGAGATCCTGCAGCAGCAGCTGCTCC AGGAGCAGGC-CATGTTACTGCAC 



NOV3b: 1477 CAGCAGCAG 1485 (SEQ ID NO: 64) 
60 A CA CAG 

NIK : 1483 GACCA-CAG 1490 (SEQ ID NO: 31)

TABLE 15

Score = 2114 bits (5478), Expect = 0.0
Identities = 1093/1303 (83%), Positives = 1093/1303 (83%), Gaps = 8/1303 (0%)

NOV3b:    1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60
            MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
NIK:      1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

NOV3b:   61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120
                 IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NIK:     61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

NOV3b:  121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180
            KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NIK:    121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180

NOV3b:  181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240
            TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NIK:    181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240

NOV3b:  241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300
            ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NIK:    241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300

NOV3b:  301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360
            QLKDHI                                PSSIMNVPGESTLRREFLRLQQ
NIK:    301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360

NOV3b:  361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420
            ENKSNSEALK          RDPEAHIKHLLH
NIK:    361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420

NOV3b:  421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHXXXXXXXXXXXXXXXXXXHAYLKSXX 480
                       DMQAL              Y R                   HAYLKS
NIK:    421 LQEKEQQRRLEDMQALRREEERRQAEREQEYKRKQLEEQRQSERLQRQLQQEHAYLKSLQ 480

NOV3b:  481 XXXXXXXXXXXXXXXXXPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 540
                             PGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS
NIK:    481 QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 540

NOV3b:  541 KPGSTXXXXXXXXXXXXXXXXXXXXXXMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600
            KPGST                      MQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ
NIK:    541 KPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600

NOV3b:  601 SLQDQPTRNLAAFPASHXXXXXXXXXXXXXXXRGAVIRQNSDPTSEGPGPSPNPPAWVRP 660
            SLQDQPTRNLAAFPASH               RGAVIRQNSDPTSEGPGPSPNPPAWVRP
NIK:    601 SLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRP 660

NOV3b:  661 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 720
            DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH
NIK:    661 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 720

NOV3b:  721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 780
            LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA        DFVLLKERT
NIK:    721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA--------DFVLLKERT 772

NOV3b:  781 LDEAPRPPKKAMDYXXXXXXXXXXXXXXXXXXXXXXXXXRDTPGGRSDGDTDSVSTMVVH 840
            LDEAPRPPKKAMDY                         RDTPGGRSDGDTDSVSTMVVH
NIK:    773 LDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMVVH 832

NOV3b:  841 DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 900
            DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP
NIK:    833 DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 892

NOV3b:  901 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 960
            SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV
NIK:    893 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 952

NOV3b:  961 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 1020
            RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG
NIK:    953 RKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 1012

NOV3b: 1021 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 1080
            KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT
NIK:   1013 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 1072

NOV3b: 1081 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 1140
            VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL
NIK:   1073 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 1132

NOV3b: 1141 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1200
            TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC
NIK:   1133 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 1192

NOV3b: 1201 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1260
            YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM
NIK:   1193 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 1252

NOV3b: 1261 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1303 (SEQ ID NO: 65)
            HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW
NIK:   1253 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1295 (SEQ ID NO: 32)



Based on its relatedness to known members of the STE20 family of protein kinases, 
NOV3b provides new diagnostic and therapeutic compositions useful in the treatment of 
disorders associated with alterations in the expression of members of the STE20 family of 
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present 
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies, 
including, by way of nonlimiting example, those involving metabolic and endocrine disorders, 
cancer, bone disorders, and tissue/cell growth regulation disorders. 



NOV-3c

A NOV-3c sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3c nucleic acid and
its encoded polypeptide include the sequences shown in Table 16. The disclosed nucleic acid
(SEQ ID NO: 10) is 3822 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3820-3822. The start and stop codons are shown in bold font. The respective ORF
encodes a 1273 amino acid polypeptide (SEQ ID NO: 11).

TABLE 16 

ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCTGCTGGGATCTTTGAGCT 
TGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAAGGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCA 
AGGTCATGGATGTCACGGAGGACGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGC 
AACATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCTCTGGCTGGTGATGGAGTT 
CTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAAAGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCT 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTG 
CTGACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGCAGACGGAACAC 
TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGA 
GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCATTGACTTCATTGA 
CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCA 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAGAA 
TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATCATGAACGTGCCTGG 
AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCTTTAAAACAGCAGCAGC 
AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCATAGAGGAG 
CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTGCAGGAGAAGGAGCAGCA 
GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC 
ACAGGCTAGAGGAGGAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGGCCCTGCTGCTGGAATAC 
AAGCGGAAGCAGCTGGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGCAGCAGGAGCATGCCTACCTCAAGTC 
CCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCAGAAACAGCAGCAGCAGCAGCTCCTGCCTGGGGACAGGAAGCCCCTGT 
ACCATTATGGTCGGGGCATGAATCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGTGGCACACCGGGTCCCACTGAAG 
CCATATGCAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCCCAGCCTCCCA 
TGACCCCGACCCTGCCATCCCCGCACCCACTGCCACGCCCAGTGCCCGAGGAGCTGTCATCCGCCAGAATTCAGACCCCA 
CCTCTGAAGGACCTGGCCCCAGCCCGAATCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCCCACCCAAGGTGCCTCAG 
AGGACCTCATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGGCCAGCCCAGGCAGTCCGTGCCAGTAA 
CCCCGACCTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCCAGCCTCTCACGGGCACCTCCCCCAGG 
CTGGCTCACTGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGGACAGCTCCCCTGTGCTCTCCCCTGGGAATAAAGCC 
AAGCCCGACGACCACCGCTCACGGCCAGGCCGGCCCGCAAGCTATAAGCGAGCAATTGGTGAGGACTTTGTGTTGCTGAA 
AGAGCGGACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCATGGACTACTCGTCGTCCAGCGAGGAGGTGGAAAGCA 
GTGAGGACGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCCTGGGGGCCGCAGCGATGGGGAT 
ACAGACAGCGTCAGCACCATGGTGGTCCACGACGTCGAGGAGATCACCGGGACCCAGCCCCCATACGGGGGCGGCACCAT 
GGTGGTCCAGCGCACCCCTGAAGAGGAGCGGAACCTGCTGCATGCTGACAGCAATGGGTACACAAACCTGCCTGACGTGG 
TCCAGCCCAGCCACTCACCCACCGAGAACAGCAAAGGCCAAAGCCCACCCTCGAAGGATGGGAGTGGTGACTACCAGTCT 
CGTGGGCTGGTAAAGGCCCCTGGCAAGAGCTCGTTCACGATGTTTGTGGATCTAGGGATCTACCAGCCTGGAGGCAGTGG 
GGACAGCATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTCGGCTCGACCAGCTGCAGTACGACGTGAGGAAGGGTT 
CTGTGGTCAACGTGAATCCCACCAACACCCGGGCCCACAGTGAGACCCCTGAGATCCGGAAGTACAAGAAGCGATTCAAC 
TCCGAGATCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGCACGGAGAACGGGCTGATGTTGCTGGACCGAAG 


TGGGCAGGGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCAGCAGATGGATGTGCTGGAGGGGCTCAACCTGCTCA 
TCACCATCTCAGGGAAAAGGAACAAACTGCGGGTGTATTACCTGTCCTGGCTCCGGAACAAGATTCTGCACAATGACCCA 
GAAGTGGAGAAGAAGCAGGGCTGGACCACCGTGGGGGACATGGAGGGCTGCGGGCACTACCGTGTTGTGAAATACGAGCG 
GATTAAGTTCCTGGTCATCGCCCTCAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTACCACAAATTCATGG 
CCTTCAAGTCCTTTGCCGACCTCCCCCACCGCCCTCTGCTGGTCGACCTGACAGTAGAGGAGGGGCAGCGGCTCAAGGTC
ATCTATGGCTCCAGTGCTGGCTTCCATGCTGTGGATGTCGACTCGGGGAACAGCTATGACATCTACATCCCTGTGCACAT 
CCAGAGCCAGATCACGCCCCATGCCATCATCTTCCTCCCCAACACCGACGGCATGGAGATGCTGCTGTGCTACGAGGACG 
AGGGTGTCTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGCTGCAGTGGGGGGAGATGCCTACTTCTGTGGCC 
TACATCTGCTCCAACCAGATAATGGGCTGGGGTGAGAAAGCCATTGAGATCCGCTCTGTGGAGACGGGCCACCTCGACGG 
GGTCTTCATGCACAAACGAGCTCAGAGGCTCAAGTTCCTGTGTGAGCGGAATGACAAGGTGTTTTTTGCCTCAGTCCGCT
CTGGGGGCAGCAGCCAAGTTTACTTCATGACTCTGAACCGTAACTGCATCATGAACTGGTGA (SEQ ID NO: 10) 



MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTEDEEEEIKQEINMLKKYSHHR
NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVL
LTENAEVKLVDFGVSAQLDRTVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRIQLKDHIDRSRKKRGEKEETE
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEE
QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY
KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVVAHRVPLK
PYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQ
RTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKA
KPDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGD
TDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQS
RGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFN
SEILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDP
EVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKV
IYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVA
YICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW (SEQ ID NO: 11)

The disclosed NOV-3c nucleic acid sequence has homology (72% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
17. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 9.1e-299.
Moreover, the disclosed, encoded amino acid sequence has 1048 of 1332 amino acid residues
(78%) identical to a human NIK-related protein (GenBank Accession No: BAA90753), as shown
in Table 18. Furthermore, the encoded amino acid sequence also has homology (79% identity)
to a human GCK kinase (GenBank Accession No: BAA94838), a member of another subgroup of
the STE20 kinase family, as shown in Table 19. As indicated by the "Expect" values, the
probability of each of these amino acid alignments occurring by chance alone is 0.0, the
lowest probability score.
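
The percentages quoted above follow directly from the counts printed in the alignment summaries. As a minimal sketch (the function names and the regular expression are illustrative assumptions, not part of any alignment tool), the following Python parses a summary line of the form used for the amino acid alignments and recomputes each percentage. The printed values appear consistent with truncation to a whole percent rather than rounding; for example, 1051/1332 is roughly 78.9% but is reported as 78%.

```python
import re

def parse_fraction(summary, field):
    """Return (count, total, reported_percent) for a field such as 'Identities'."""
    pattern = rf"{field}\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
    match = re.search(pattern, summary)
    if match is None:
        raise ValueError(f"{field} not found in summary line")
    count, total, percent = (int(group) for group in match.groups())
    return count, total, percent

def truncated_percent(count, total):
    """Whole-number percentage, truncated rather than rounded.

    Truncation is an assumption chosen because it reproduces the
    figures printed in the tables.
    """
    return (100 * count) // total

# Summary line as printed for the NOV3c / NIK-related protein alignment.
summary = ("Identities = 1048/1332 (78%), Positives = 1051/1332 (78%), "
           "Gaps = 96/1332 (7%)")

for field in ("Identities", "Positives", "Gaps"):
    count, total, reported = parse_fraction(summary, field)
    print(field, count, total, reported, truncated_percent(count, total))
```

For each of the three fields the recomputed value equals the reported one, which is a quick way to catch transcription errors in the counts.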
TABLE 17

Score = 3907 (586.2 bits), Expect = 9.1e-299, Sum P(2) = 9.1e-299
Identities = 1297/1788 (72%), Positives = 1297/1788 (72%), Strand = Plus / Plus

NOV3c: 4 GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62

GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC 
NIK : 3 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62 

NOV3c : 63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 

122 

TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA 
NIK : 63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGTTGGAAATGGCACCTATGGACAAGTCTATAA 

122 

NOV3c: 123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA
182 

GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA 
NIK : 123 GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTCACCGAGGA 
179 

N0V3c : 183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 
242 

GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA 
NIK : 180 TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 
239 

NOV3c: 243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT
302 

AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT 
NIK : 240 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 
299 

N0V3c : 303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 
362 

CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA 
NIK : 300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 
359 

N0V3c : 363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 

422

AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T 
NIK : 360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 
419 

N0V3c : 423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 
482 

GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT
NIK : 420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT
479 

N0V3c : 483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 
542 

GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC 
NIK : 480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 
539 

N0V3c: 543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 
601 

GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG 
NIK : 540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG
598 

N0V3c: 602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 
660 

CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG 
NIK : 599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC
657 

N0V3C : 661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
720 

ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA 
NIK : 658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 
717 

N0V3C : 721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 
780 

GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG 
NIK : 718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 
777 

NOV3c: 781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA
838

AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A
NIK : 778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 
835 

NOV3c: 839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA
898 

C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A 
NIK : 836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 
895 

NOV3 c : 899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 
958 

TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G 
NIK : 896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 
955 

NOV3c: 959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG
1014 

A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG 
NIK : 956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG
1014 

N0V3C : 1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 
1074 

CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT 
NIK : 1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 
1074 

NOV3C : 1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 
1132 

CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC 

NIK : 1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 
1132 



NOV3c: 1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA
1192 

AGCAGC CG GA C GAGG AAA CA CTGCTG AG GGCAG GCG A 
NIK : 1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA
1192 
NOV3c: 1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 1251
T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G
NIK : 1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 1252

NOV3c: 1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG-G--CTGGAGGACATGCAGGC-TCT 1307
C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT
NIK : 1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 1310

NOV3c: 1308 GCGGCGGGA--GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 1365
CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG
NIK : 1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG-ACGGGCAGAAGAGG 1366

NOV3c: 1366 CTA-GAGGAG-GAGCAGCGACAGCTCGAGATCCTTCAGCAACAGCTGCTCCAGGAACAGG 1423
A GAGGAG G G AG G AC GAG T C TCAG C GC GCT AGGA AG
NIK : 1367 AGAAGAGGAGAGTGGAGAGGGAACAGGAG-TACATCAGG--CGGCAGCTAGAGGAGGAGC 1423

NOV3c: 1424 CCCTGCTGCTGGA-ATACA--AGCGGAAGCAGCTGGAGGAGCAGCGGCA-GTCAGAACGT 1479
C GC CTGGA AT C AGC G AGC GCT AGGAGCAG G CA GT A C
NIK : 1424 AGCGGCACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAG-GCCATGTTACTGCAC 1482

NOV3c: 1480 CTCCAGAGGCAGCTGCA-GCAGGAGCATGCCTACCTCAAGTCCCTGCAGCAGCAGCAACA 1538
CCA AGG GC GCA GCA AGCA GC CC C G CCC GCAGCAGCAG A CA
NIK : 1483 GACCACAGGAGGCCGCACGCAC-AGCA-GCAG-CCGCC-GCCCCCGCAGCAGCAGGA-CA 1537

NOV3c: 1539 GCAGCAG--C-AGCTT-CA-GAAACAGCAGCAGCAGCAGCTCC-TG-CC-TGGGGACAGG 1590
G AGCA C AGCTT CA G CAG AGC AGC C C TG CC TG GACAG
NIK : 1538 GGAGCAAACCGAGCTTTCATGCTCCAG-AGCCCAAGCCTCACTATGACCCTGCTGACAG- 1595

NOV3c: 1591 AAGCCCCTGTACCATTATGGTCGGGGCATGAATCCCGCT-GA-CAAAC-CAGCCTGGGCC 1647
AGC C G A TGGTC C G ATC C C GA CAA C CC G C
NIK : 1596 -AGCTCGGGAGGTACAGTGGTCCCACCTGGCATCTCTCAAGAACAATGTCTCCCCTGTCT 1654

NOV3c: 1648 CGAGAGGTAGTGGCACACCGGGTCCCACTGAAGCCATAT--GCAGCACCTGTACC-CCGA 1704
CGAGA T C C G G CCC T CCA AT GCA CACC A C CCG
NIK : 1655 CGAGATCCCATTCCTT-CAGTGACCCT-TCTC-CCAAATTCGCA-CACCACCATCTCCGC 1710

NOV3c: 1705 TCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCCCAGCCTCCCATGAC 1764
TC CAG CC CA G CCA CC CCCG A GG GC CAG C C TGAC
NIK : 1711 TCTCAGGACC--CATGTCCA-CCTTCCCGCAGTGAGGG-GCTCAGTCAGAG-CTC-TGAC 1764

NOV3c: 1765 CCCGACCCTGCCATCCCCGCACCCAC 1790 (SEQ ID NO: 66)
C A C G T CCCG CCCAC
NIK : 1765 TCTAAGTCGGAGGTGCCCGAGCCCAC 1790 (SEQ ID NO: 33)
TABLE 18

Score = 1985 bits (5143), Expect = 0.0
Identities = 1048/1332 (78%), Positives = 1051/1332 (78%), Gaps = 96/1332 (7%)

NOV3c: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60
MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
NIK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

NOV3c: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120
IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120

NOV3c: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180
KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NIK : 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180

NOV3c: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240
TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NIK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240

NOV3c: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NIK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300






N0V3c: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 

QLKDHI PSSIMNVPGESTLRREFLRLQQ 
NIK : 301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

N0V3c : 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 

ENKSNSEALK RDPEAHIKHLLH 
NIK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

NOV3c: 421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHRXXXXXXXXXXXXXXXXXXXXXXXXY 480

DMQAL Y 
NIK : 421 LQEKEQQRRLEDMQAL RREEERRQAEREQEY 451 

N0V3c: 481 KRKXXXXXXXXXXXXXXXXXXHAYLKSXXXXXXXXXXXXXXXXXXXPGDRKPLYHYGRGM 540 
KRK HAYLKS PGDRKPLYHYGRGM

NIK : 452 KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 511 

N0V3c: 541 NPADKPAWAREVVAH RVP 558 

NPADKPAWAREV + p 

NIK : 512 NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 571






NOV3c: 559 LKPYAAP VPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 601 

++P P VPRSQSLQDQPTRNLAAFPASH 
NIK : 572 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 631 

NOV3c : 602 XRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 661 

RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 
NIK : 632 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 691 

NOV3c: 662 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 721

AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
NIK : 692 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 751 

NOV3c: 722 PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYXXXXXXXXXXXXXXXXX 781 
PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDY

NIK : 752 PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 803 
NOV3c: 782 XXXXXXXXRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 841
RDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH
NIK : 804 EGGPAEGSRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 863

NOV3c: 842 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 901
ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY
NIK : 864 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 923

NOV3c: 902 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 961
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS
NIK : 924 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 983

NOV3c: 962 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1021
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN
NIK : 984 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1043

NOV3c: 1022 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1081
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV
NIK : 1044 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1103

NOV3c: 1082 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1141
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI
NIK : 1104 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1163

NOV3c: 1142 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1201
YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY
NIK : 1164 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1223

NOV3c: 1202 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1261
ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NIK : 1224 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1283

NOV3c: 1262 FMTLNRNCIMNW 1273 (SEQ ID NO: 67)
FMTLNRNCIMNW
NIK : 1284 FMTLNRNCIMNW 1295 (SEQ ID NO: 34)

TABLE 19 

Score = 2007 bits (5201), Expect = 0.0
Identities = 1056/1332 (79%), Positives = 1059/1332 (79%), Gaps = 88/1332 (6%)

NOV3c: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60
MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT
GCK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60

N0V3c: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
GCK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 
NOV3c: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180
KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
GCK : 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180

NOV3c: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240
TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
GCK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240

NOV3c: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
GCK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300

N0V3c: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 

QLKDHI PSSIMNVPGESTLRREFLRLQQ 
GCK : 301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

N0V3c: 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 

ENKSNSEALK RDPEAHIKHLLH 
GCK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

NOV3c: 421 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHRXXXXXXXXXXXXXXXXXXXXXXXXY 480

DMQAL Y
GCK : 421 LQEKEQQRRLE DMQAL RREEERRQAEREQEY 451 

N0V3c: 481 KRKXXXXXXXXXXXXXXXXXXHAYLKSXXXXXXXXXXXXXXXXXXXPGDRKPLYHYGRGM 540 
KRK HAYLKS PGDRKPLYHYGRGM

GCK : 452 KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 511 

NOV3c: 541 NPADKPAWAREVVAH RVP 558

NPADKPAWAREV + p 

GCK : 512 NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 571

N0V3c: 559 LKPYAAP VPRSQSLQDQPTRNLAAFPASHXXXXXXXXXXXXXX 601 

++P P VPRSQSLQDQPTRNLAAFPASH 
GCK : 572 VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 631 

N0V3c: 602 XRGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 661 

RGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
GCK : 632 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ 691 

NOV3c: 662 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 721

AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 
GCK : 692 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK 751 

N0V3c: 722 PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYXXXXXXXXXXXXXXXXX 781 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDY 
GCK : 752 PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 811 

NOV3c: 782 XXXXXXXXRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 841
RDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH
GCK : 812 EGGPAEGSRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 871

NOV3c: 842 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 901
ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY
GCK : 872 ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 931

NOV3c: 902 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 961
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS
GCK : 932 QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNS 991

NOV3c: 962 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1021
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN
GCK : 992 EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 1051

NOV3c: 1022 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1081
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV
GCK : 1052 KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 1111

NOV3c: 1082 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1141
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI
GCK : 1112 YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 1171

NOV3c: 1142 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1201
YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY
GCK : 1172 YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAY 1231

NOV3c: 1202 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1261
ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
GCK : 1232 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY 1291

NOV3c: 1262 FMTLNRNCIMNW 1273 (SEQ ID NO: 68)
FMTLNRNCIMNW
GCK : 1292 FMTLNRNCIMNW 1303 (SEQ ID NO: 35)



Based on its relatedness to known members of the STE20 family of protein kinases,
NOV3b provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.

NOV-3d 

A NOV-3d sequence according to the invention is a nucleic acid sequence encoding a
polypeptide related to the STE20 family of protein kinases. A disclosed NOV-3d nucleic acid and
its encoded polypeptide include the sequences shown in Table 20. The disclosed nucleic acid
(SEQ ID NO: 12) is 3735 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA stop codon at
nucleotides 3733-3735. The start and stop codons are shown in bold font. The disclosed,
respective ORF encodes a 1244 amino acid polypeptide (SEQ ID NO: 13).
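
The coordinates given above fix the length of the encoded polypeptide: an ORF spanning nucleotides 1 to 3735 contains 1245 codons, the last of which is the stop codon, leaving 1244 amino acids. A minimal sketch of that bookkeeping follows; the function name `orf_summary` and its return format are illustrative assumptions.

```python
def orf_summary(start, stop_end):
    """Length bookkeeping for an ORF given 1-based, inclusive coordinates.

    start    -- position of the first base of the initiation codon
    stop_end -- position of the last base of the stop codon
    """
    length_nt = stop_end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    # The stop codon is counted as a codon but encodes no amino acid.
    return {"length_nt": length_nt, "codons": codons, "amino_acids": codons - 1}

# Coordinates stated for the NOV-3d ORF (ATG at 1-3, TGA at 3733-3735).
nov3d = orf_summary(start=1, stop_end=3735)
print(nov3d)  # {'length_nt': 3735, 'codons': 1245, 'amino_acids': 1244}
```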
TABLE 20

ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCCTGCTGGGATCTTTGAGCT 
TGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAAGGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCA 
AGGTCATGGATGTCACGGAGGACGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGC 
AACATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCTCTGGCTGGTGATGGAGTT 
CTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAAAGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCT 
GCAGGGAGATCCTCAGGGGTCTGGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTG 
CTGACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCACCGTGGGCAGACGGAACAC 
TTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCGCCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGA 
GTGATATTTGGTCTCTAGGAATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 
GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAGAAGTTCATTGACTTCATTGA 
CACATGTCTCATCAAGACTTACCTGAGCCGCCCACCCACGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCA 
CGGAGCGGCAGGTCCGCATCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAGAA 
TATGAGTACAGCGGCAGCGAGGAGGAAGATGACAGCCATGGAGAGGAAGGAGAGCCAAGCTCCATCATGAACGTGCCTGG
AGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTCCAGCAGGAAAATAAGAGCAACTCAGAGGCTTTAAAACAGCAGCAGC 
AGCTGCAGCAGCAGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCATAGAGGAG 
CAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGAGCAGCGGAAGCTGCAGGAGAAGGAGCAGCA 
GCGGCGGCTGGAGGACATGCAGGCTCTGCGGCGGGAGGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTC 
ACAGGCTAGAGGAGCAGCGGCAGTCAGAACGTCTCCAGAGGCAGCTGCAGCAGGAGCATGCCTACCTCAAGTCCCTGCAG 
CAGCAGCAACAGCAGCAGCAGCTTCAGAAACAGCAGCAGCAGCAGCTCCTGCCTGGGGACAGGAAGCCCCTGTACCATTA 
TGGTCGGGGCATGAATCCCGCTGACAAACCAGCCTGGGCCCGAGAGGTAGTGGCACACCGGGTCCCACTGAAGCCATATG 
CAGCACCTGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCGAAACCTGGCTGCCTTCCCAGCCTCCCATGACCCC 
GACCCTGCCATCCCCGCACCCACTGCCACGCCCAGTGCCCGAGGAGCTGTCATCCGCCAGAATTCAGACCCCACCTCTGA 
AGGACCTGGCCCCAGCCCGAATCCCCCAGCCTGGGTCCGCCCAGATAACGAGGCCCCACCCAAGGTGCCTCAGAGGACCT 
CATCTATCGCCACTGCCCTTAACACCAGTGGGGCCGGAGGGTCCCGGCCAGCCCAGGCAGTCCGTGCCAGTAACCCCGAC 
CTCAGGAGGAGCGACCCTGGCTGGGAACGCTCGGACAGCGTCCTTCCAGCCTCTCACGGGCACCTCCCCCAGGCTGGCTC 
ACTGGAGCGGAACCGCGTGGGAGTCTCCTCCAAACCGGACAGCTCCCCTGTGCTCTCCCCTGGGAATAAAGCCAAGCCCG 
ACGACCACCGCTCACGGCCAGGCCGGCCCGCAAGCTATAAGCGAGCAATTGGTGAGGACTTTGTGTTGCTGAAAGAGCGG 
ACTCTGGACGAGGCCCCTCGGCCTCCCAAGAAGGCCATGGACTACTCGTCGTCCAGCGAGGAGGTGGAAAGCAGTGAGGA 
CGACGAGGAGGAAGGCGAAGGCGGGCCAGCAGAGGGGAGCAGAGATACCCCTGGGGGCCGCAGCGATGGGGATACAGACA 
GCGTCAGCACCATGGTGGTCCACGACGTCGAGGAGATCACCGGGACCCAGCCCCCATACGGGGGCGGCACCATGGTGGTC 
CAGCGCACCCCTGAAGAGGAGCGGAACCTGCTGCATGCTGACAGCAATGGGTACACAAACCTGCCTGACGTGGTCCAGCC 
CAGCCACTCACCCACCGAGAACAGCAAAGGCCAAAGCCCACCCTCGAAGGATGGGAGTGGTGACTACCAGTCTCGTGGGC 
TGGTAAAGGCCCCTGGCAAGAGCTCGTTCACGATGTTTGTGGATCTAGGGATCTACCAGCCTGGAGGCAGTGGGGACAGC 
ATCCCCATCACAGCCCTAGTGGGTGGAGAGGGCACTCGGCTCGACCAGCTGCAGTACGACGTGAGGAAGGGTTCTGTGGT 
CAACGTGAATCCCACCAACACCCGGGCCCACAGTGAGACCCCTGAGATCCGGAAGTACAAGAAGCGATTCAACTCCGAGA 
TCCTCTGTGCAGCCCTTTGGGGGGTCAACCTGCTGGTGGGCACGGAGAACGGGCTGATGTTGCTGGACCGAAGTGGGCAG 
GGCAAGGTGTATGGACTCATTGGGCGGCGACGCTTCCAGCAGATGGATGTGCTGGAGGGGCTCAACCTGCTCATCACCAT 
CTCAGGGAAAAGGAACAAACTGCGGGTGTATTACCTGTCCTGGCTCCGGAACAAGATTCTGCACAATGACCCAGAAGTGG 
AGAAGAAGCAGGGCTGGACCACCGTGGGGGACATGGAGGGCTGCGGGCACTACCGTGTTGTGAAATACGAGCGGATTAAG 
TTCCTGGTCATCGCCCTCAAGAGCTCCGTGGAGGTGTATGCCTGGGCCCCCAAACCCTACCACAAATTCATGGCCTTCAA 
GTCCTTTGCCGACCTCCCCCACCGCCCTCTGCTGGTCGACCTGACAGTAGAGGAGGGGCAGCGGCTCAAGGTCATCTATG 
GCTCCAGTGCTGGCTTCCATGCTGTGGATGTCGACTCGGGGAACAGCTATGACATCTACATCCCTGTGCACATCCAGAGC 
CAGATCACGCCCCATGCCATCATCTTCCTCCCCAACACCGACGGCATGGAGATGCTGCTGTGCTACGAGGACGAGGGTGT 
CTACGTCAACACGTACGGGCGCATCATTAAGGATGTGGTGCTGCAGTGGGGGGAGATGCCTACTTCTGTGGCCTACATCT 
GCTCCAACCAGATAATGGGCTGGGGTGAGAAAGCCATTGAGATCCGCTCTGTGGAGACGGGCCACCTCGACGGGGTCTTC
ATGCACAAACGAGCTCAGAGGCTCAAGTTCCTGTGTGAGCGGAATGACAAGGTGTTTTTTGCCTCAGTCCGCTCTGGGGG 
CAGCAGCCAAGTTTACTTCATGACTCTGAACCGTAACTGCATCATGAACTGGTGA (SEQ ID NO: 12) 

MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVTEDEEEEIKQEINMLKKYSHHR
NIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNTKGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVL
LTENAEVKLVDFGVSAQLDRTVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRIQLKDHIDRSRKKRGEKEETE
YEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEE
QKEERRRVEEQQRREREQRKLQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEQRQSERLQRQLQQEHAYLKSLQ
QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDP
DPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPD
LRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKER
TLDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMVVHDVEEITGTQPPYGGGTMVV
QRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDS
IPITALVGGEGTRLDQLQYDVRKGSVVNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQ
GKVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIK
FLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQS
QITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVF
MHKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW (SEQ ID NO: 13)
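
As a quick consistency check on Table 20, the first and last codons of SEQ ID NO: 12 can be translated with the standard genetic code and compared with the two ends of SEQ ID NO: 13. The sketch below is illustrative only: `translate` and `CODON_TABLE` are hypothetical names, and it uses two short fragments copied from the ends of the nucleic acid sequence rather than the full 3735 nucleotides.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 45 and last 27 nucleotides of SEQ ID NO: 12 as printed in Table 20.
five_prime = "ATGGGCGACCCAGCCCCCGCCCGCAGCCTGGACGACATCGACCTG"
three_prime = "AACCGTAACTGCATCATGAACTGGTGA"

print(translate(five_prime))   # MGDPAPARSLDDIDL
print(translate(three_prime))  # NRNCIMNW*
```

Both fragments agree with the corresponding ends of the polypeptide sequence, with the final TGA read as the stop codon.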

The disclosed NOV-3d nucleic acid sequence has homology (73% identity) to a mouse
mRNA for a NIK protein (NIK) (GenBank Accession No: MMU88984), as shown in Table
21. NIK proteins are a subgroup of the STE20 family of protein kinases. As indicated by the
"Expect" value, the probability of this alignment occurring by chance alone is 2.2e-295.
Moreover, the disclosed, encoded amino acid sequence has 1046 of 1303 amino acid residues
(80%) identical to a human NIK-related protein (GenBank Accession No: BAA90753), as shown
in Table 22. Furthermore, the disclosed, encoded amino acid sequence also has homology
(80% identity) to a human GCK kinase (GenBank Accession No: BAA94838), a member of another
subgroup of the STE20 kinase family, as shown in Table 23. As indicated by the "Expect"
values, the probability of each of these amino acid alignments occurring by chance alone is
0.0, the lowest probability score.
TABLE 21

Score = 3832, Expect = 2.2e-295
Strand = Plus / Plus

NOV3d: 4 GGCGACCCAGCC-CCCGCCCGCAGCCTGGACGACATCGACCTGTCCGCCCTGCGGGACCC 62
GGCGA C A C CCCGC AGCCTGG GACAT GACCTGTC CCCTGCGGGACCC
NIK : 3 GGCGAACGACTCTCCCGCGAAGAGCCTGGTGGACATTGACCTGTCGTCCCTGCGGGACCC 62

NOV3d: 63 TGCTGGGATCTTTGAGCTTGTGGAGGTGGTCGGCAATGGAACCTACGGACAGGTGTACAA 122
TGCTGGGAT TTTGAGCT GTGGA GTGGT GG AATGG ACCTA GGACA GT TA AA
NIK : 63 TGCTGGGATTTTTGAGCTGGTGGAAGTGGTTGGAAATGGCACCTATGGACAAGTCTATAA 122

NOV3d: 123 GGGTCGGCATGTCAAGACGGGGCAGCTGGCTGCCATCAAGGTCATGGATGTCACGGAGGA 182
GGGTCG CATGT AA ACGG CA CTG C GCCATCAAGGT ATGGA GTCAC GAGGA
NIK : 123 GGGTCGACATGTTAAAACGGT-CA-CTGCC-GCCATCAAGGTTATGGACGTCACCGAGGA 179

NOV3d: 183 CGAGGAGGAAGAGATCAAACAGGAGATCAACATGCTGAAAAAGTACTCTCACCACCGCAA 242
GA GAGGAAGA ATCA AC GGAGAT AA ATGCTGAA AAGTA TCTCA CA CG AA
NIK : 180 TGAAGAGGAAGAAATCACACTGGAGATAAATATGCTGAAGAAGTATTCTCATCATCGAAA 239

NOV3d: 243 CATCGCCACCTACTACGGAGCCTTCATCAAGAAGAGCCCCCCGGGAAACGATGACCAGCT 302
AT GCCAC TACTA GG GC TTCAT AAGAAGAGCCC CC GGA A GATGACCA CT
NIK : 240 TATTGCCACGTACTATGGTGCTTTCATTAAGAAGAGCCCTCCAGGACATGATGACCAACT 299

NOV3d: 303 CTGGCTGGTGATGGAGTTCTGTGGTGCTGGTTCAGTGACTGACCTGGTAAAGAACACAAA 362
CTGGCT GT ATGGAGTT TGTGG GCTGG TC T AC GACCT GT AAGAACAC AA
NIK : 300 CTGGCTTGTTATGGAGTTTTGTGGGGCTGGGTCCATCACAGACCTTGTGAAGAACACCAA 359

NOV3d: 363 AGGCAACGCCCTGAAGGAGGACTGTATCGCCTATATCTGCAGGGAGATCCTCAGGGGTCT 422
AGG AAC C CT AA GA GACTG AT GC TA ATCT CAGGGA ATCCTCAGGGG T
NIK : 360 AGGGAACACTCTCAAAGAAGACTGGATTGCTTACATCTCCAGGGAAATCCTCAGGGGATT 419

NOV3d: 423 GGCCCATCTCCATGCCCACAAGGTGATCCATCGAGACATCAAGGGGCAGAATGTGCTGCT 482
GGC CATCTCCAT CAC A GT AT CA CGAGA ATCAAGGG CA AATGTGCTGCT
NIK : 420 GGCACATCTCCATATTCACCACGTTATTCACCGAGATATCAAGGGCCAAAATGTGCTGCT 479

NOV3d: 483 GACAGAGAATGCTGAGGTCAAGCTAGTGGATTTTGGGGTGAGTGCTCAGCTGGACCGCAC 542
GAC GAGAATGCTGAGGT AA CT GT GATTTTGG GT AG GCTCAGCTGGAC G AC
NIK : 480 GACCGAGAATGCTGAGGTGAAACTTGTTGATTTTGGTGTAAGCGCTCAGCTGGACAGGAC 539

NOV3d: 543 CGTGGG-CAGACGGAACACTTTCATTGGGACTCCCTACTGGATGGCTCCAGAGGTCATCG 601
GT GG C GA G AA AC TTCAT GG AC CCCTACTGGATGGCTCCAGAGGTCATCG
NIK : 540 GGTTGGACGGA-GAAATACGTTCATAGGCACACCCTACTGGATGGCTCCAGAGGTCATCG 598

NOV3d: 602 CCTGTGATGAGAACCCTGATGCCACCTATGATTACAGGAGTGATATTTGGTCTCTA-GGA 660
CCTGTGATGAGAACCC GA GCCAC TA GA TACAG AGTGA T TGGTC CT GG
NIK : 599 CCTGTGATGAGAACCCAGACGCCACTTACGACTACAGAAGTGACCTCTGGTC-CTGTGGC 657

NOV3d: 661 ATCACAGCCATCGAGATGGCAGAGGGAGCCCCCCCTCTGTGTGACATGCACCCCATGCGA 720
ATCACAGCCATCGAGATGGC GA GG G CCCCCCTCT TGTGACATGCA CC ATG GA
NIK : 658 ATCACAGCCATCGAGATGGCTGAAGGGGGCCCCCCTCTCTGTGACATGCATCCAATGAGA 717

NOV3d: 721 GCCCTCTTCCTCATTCCTCGGAACCCTCCGCCCAGGCTCAAGTCCAAGAAGTGGTCTAAG 780
GC CT TT CTCAT CC G AACCCTCC CCCAGGCT AAGTC AA AA TGGTC AAG
NIK : 718 GCGCTGTTTCTCATCCCCAGAAACCCTCCTCCCAGGCTGAAGTCAAAAAAATGGTCAAAG 777

NOV3d: 781 AAGTTCATTGA-CTTCATTGACACATGTCTCATCAAGACTTACCTG-AGCCGCCCACCCA 838
AA TT TT A CTT AT GA TGTCT T AAGA TTAC TG AGC GCCC C A
NIK : 778 AAATTT-TTCAGCTTTATAGAAGGCTGTCTGGTGAAGAATTACATGCAGCGGCCCTCT-A 835

NOV3d: 839 CGGAGCAGCTACTGAAGTTTCCCTTCATCCGGGACCAGCCCACGGAGCGGCAGGTCCGCA 898
C GAGCA CT T AA CC TTCAT GGGA CAGCCCA GA GGCAGGT CG A
NIK : 836 CAGAGCAACTTTTAAAACACCCTTTCATAAGGGATCAGCCCAATGAAAGGCAGGTTCGAA 895

NOV3d: 899 TCCAGCTTAAGGACCACATTGACCGATCCCGGAAGAAGCGGGGTGAGAAAGAGGAGACAG 958
TCCAGCTTAAGGA CACAT GACCG CC G AAGAAG G GG GAGAAAGA GAGAC G
NIK : 896 TCCAGCTTAAGGATCACATAGACCGGACCAGAAAGAAGAGAGGCGAGAAAGATGAGACGG 955

NOV3d: 959 AATATGAGTACAGCGGCAGCGAGGAGGAAGATGAC-A-GC-CATGGAG-AGGAAGGAGAG 1014
A TA GAGTACAGCGG AGCGAGGAGGA GA GA AG C TG AG AGGA GGAGAG
NIK : 956 AGTACGAGTACAGCGGGAGCGAGGAGGAGGAGGAGGAAGTGCCTG-AGCAGGAGGGAGAG 1014

NOV3d: 1015 CCAAGCTCCATCATGAACGTGCCTGGAGAGTCGACTCTACGCCGGGAGTTTCTCCGGCTC 1074
CCAAG TCCATC T AA GTGCCTGGAGAGTC ACTCT CG CG GA TT CT G CT
NIK : 1015 CCAAGTTCCATCGTCAATGTGCCTGGAGAGTCAACTCTGCGACGTGATTTCCTGAGACTG 1074

NOV3d: 1075 CAGCAGGAAAATAAG-AGCAACTCAGAGGCTTTAAAACAG-CAGCAGCAGCTGCAGCAGC 1132
CAGCAGGA AA AAG AGC TC GAGGCT T AG CAGCAGC CTGCAG AGC
NIK : 1075 CAGCAGGAGAACAAGGAGCGG-TCTGAGGCTCTGCGG-AGACAGCAGCTTCTGCAGGAGC 1132

NOV3d: 1133 AGCAGCAGCGAGACCCCGAGGCACACATCAAACACCTGCTGCACCAGCGGCAGCGGCGCA 1192
AGCAGC CG GA C GAGG A A A CA CTGCTG AG GGCAG GCG A
NIK : 1133 AGCAGCTCCGGGAGCAGGAGGAGTATAAGAGGCAGCTGCTGGCTGAGAGGCAGAAGCGGA 1192

NOV3d: 1193 TAGAGGAGCAGAAGGAGGAGCGGCGCCGCGTGGAGGAGCAACAGCGGCGGGAGCGGGA-G 1251
T GA AGCAGAA GA AG GG G CG TGGA GAGCAACA G G GA CGGGA G
NIK : 1193 TTGAACAGCAGAAAGAACAGAGGAGGCGGCTGGAAGAGCAACAAAGAAGAGAACGGGAAG 1252



NOV3d: 1252 CAGCGGAAGCTGCAGGAGAAGGAGCAGCAGCGGCG-G--CTGGAGGACATGCAGGC-TCT 1307
C GGA GC GCAGGAG GAGCAGC GCGGCG G C GAGGA A G AGGC TCT
NIK : 1253 CCA-GGAGGCAGCAGGAGCGTGAGCAGCGGCGGCGTGAACAAGAGGAGAAG-AGGCGTCT 1310

NOV3d: 1308 GCGGCGGGA--GGAGGAGCGGCGGCAGGCGGAGCGCGAGCAGGAATATATTCGTCACAGG 1365
CG GG A GGA GCGGCG A G GAG GAG AGGA A C A AGG
NIK : 1311 -CGA-GGAACTGGAAAGGCGGCGTAAAGAAGAGGAAGAG-AGGAG-ACGGGCAGAAGAGG 1366

NOV3d: 1366 CTA-GAGGAGCAGCGGC-AGT---CAGAACGT-CTCCAGA-GGCAGCTGCAGCAGGAGCA 1418
A GAGGAG AG GG AG CAG A GT C CAG GGCAGCT AG AGGAGCA
NIK : 1367 AGAAGAGGAG-AGTGGAGAGGGAACAGGA-GTACATCAGGCGGCAGCTAGAGGAGGAGCA 1424

NOV3d: 1419 T-GCCTACCTCAAGTCCCTGCAGCAGCAGCAACAGCAGCAGCAGCTTCA-GAAACAGCA- 1475
G C ACCT AG CCTGCAGCAGCAGC C CAG AGCAG CA G AC GCA
NIK : 1425 GCGGC-ACCTGGAGATCCTGCAGCAGCAGCTGCTCCAGGAGCAGGC-CATGTTACTGCAC 1482

NOV3d: 1476 G-CAGCAGCAGCTCCTGCCTGGGGA-CAGGAAGCCCCTGTACCATTATGGTCGGGGCATG 1533
G C CAG AG CC GC G A CAG A GCC C G CC A G C G CA G
NIK : 1483 GACCACAGGAGG-CC-GCACGCACAGCAGCA-GCCGCCGCCCCCGCA--G-CAG--CAGG 1534

NOV3d: 1534 AATCCCGCTGACAAACC-AGCCTGG--GCCCGAGAGGTAGTGGCACACCGGGTCCCA-CT 1589
A C G G CAAACC AGC T GC C AGAG G C CAC G CCC CT
NIK : 1535 A--CAGGA-G-CAAACCGAGCTTTCATGCTCCAGAGCCCAAGCCTCACTATGACCCTGCT 1590

NOV3d: 1590 GA-AGCCATATGCAGC-ACC-TGTACCCCGATCCCAGTCCCTGCAGGACCAGCCCACCCG 1646
GA AG T G AG AC TG CCC T CA TC CT AG AC A C CCC
NIK : 1591 GACAGAGCTCGGGAGGTACAGTGGTCCCACCTGGCA-TCTCTCAAGAACAATGTCTCCCC 1649

NOV3d: 1647 AAACCTG-GCTGCC-TTCC--CAGCCTCCCATGACCCCGACCCTGC-CATCCCCGCACCC 1701
C G G T CC TTCC CAG CCC T CCC A C GC CA C CC CC
NIK : 1650 TGTCTCGAGATCCCATTCCTTCAGTGACCCTTCTCCCAAATTC-GCACACCACCATCTCC 1708

NOV3d: 1702 ACTGCCACG-CCCAGTGCCC 1720 (SEQ ID NO: 69)
CT CA G CCCA TG CC
NIK : 1709 GCTCTCAGGACCCA-TGTCC 1727 (SEQ ID NO: 36)
TABLE 22

Score = 1995 bits (5170), Expect = 0.0

Identities = 1046/1303 (80%), Positives = 1049/1303 (80%), Gaps = 67/1303 (5%)

N0V3d: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60 

MGDPAPARSLDDI DLSALRDPAGI FELVEVVGNGTYGQVYKGRHVKTGQLAAIKVMDVT 
NIK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIECVMDVTE 60 

10 N0V3d: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
NIK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

N0V3d: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 
15 KGNALKEDCIAYICREILRGLTUiLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 

NIK : 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

N0V3d: 181 TVGRRNTFIGTPY^7^LAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCD^1HPMR 240 
TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 
20 NIK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 

N0V3d: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 
NIK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

25 

' N0V3d: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 
QLKDHI . PSSIMNVPGESTLRREFLRLQQ 
NIK : 301 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

30 ^ N0V3d: 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 4 20 

ENKSNSEALK RDPEAHIKHLLH 
NIK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

N0V3d: 4 21 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHXXXXXXXXXXXXXXXXXXHAYLKSXX 4 80 
35 DMQAL Y R HAYLKS 

NIK : 421 LQEKEQQRRLEDMQALRREEERRQAEREQEYKRKQLEEQRQSERLQRQLQQEHAYLKSLQ 480 

N0V3d: 481 XXXXXXXXXXXXXXXXXPGDRKPLYHYGRGMNPADKPAWAREWAH 526 

PGDRKPLYHYGRGMNPADKPAWAREV 
40 NIK : 4 81 QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRMNKQQNSPLAKS 54 0 

N0V3d: 527 RVPLKPYAAP VPRSQ 541 

+ P+-I-P P VPRSQ 
NIK • 541 KPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600 

45 

N0V3d: 542 SLQDQPTRNLAAFPASHXXXXXXXXXXXXXXXRGAVIRQNSDPTSEGPGPSPNPPAWVRP 601 

SLQDQPTRNLAAFPASH RGAVIRQNSDPTSEGPGPSPNPPAWVRP 
NIK : 601 SLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRP 660 

50 N0V3d: 602 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 661 

DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 
NIK : 661 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 720 

N0V3d: 662 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 721 
55 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT 

NIK : 721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPA DFVLLKERT 772 

N0V3d: 722 LDEAPRPPKKAMDYXXXXXXXXXXXXXXXXXXXXXXXXXRDTPGGRSDGDTDSVSTMWH 781 
LDEAPRPPKKAMDY RDTPGGRSDGDTDSVSTMWH 
60 NIK : 773 LDEAPRPPKKAMDYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMWH 832 

N0V3d: 782 DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 841 


DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 
NIK : 833 DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 892 

N0V3d: 842 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 901 
5 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 

NIK : 893 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 952 

N0V3d: 902 RKGSWNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 961 
RKGSWNVNPTNTRAHSETPEIRKYKKRFiSISEILCAALWGVNLLVGTENGLMLLDRSGQG 
10 NIK : 953 RKGSWNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 
1012 

N0V3d: 962 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
1021 

15 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
NIK : 1013 KVYGLIGRRRFQQMDVIiEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
1072 

. N0V3d: 1022 VGDMEGCGHYRWKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
20 1081 

VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVTiL 
NIK : 1073 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
1132 

25 N6V3d: 1082 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
1141 

TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
NIK : 1133 TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
1192 

30 

NOVSd: 1142 YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
1201 

YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEBCAIEIRSVETGHLDGVFM 
NIK : 1193 YEDEGVYVNTYGRIIKDVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
35 1252 
NOV3d: 1202 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1244 (SEQ ID NO: 70)

HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 
NIK : 1253 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1295 (SEQ ID NO: 37) 



TABLE 23

Score = 2018 bits (5228), Expect = 0.0

Identities = 1054/1303 (80%), Positives = 1057/1303 (80%), Gaps = 59/1303 (4%)



N0V3d: 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTX 60 

MGDPAPARSLDDIDLSALRDPAGIFELVEVVGNGTYGQVYKGRHVKTGQIiAAIKVMDVT 
GCK : 1 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE 60 

N0V3d: 61 XXXXXIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

IKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 
GCK : 61 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT 120 

55 N0V3d: 121 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 
GCK : 121 KGNALECEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR 180 

N0V3d: 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 240 
60 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 

GCK : 181 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR 24 0 

NOV3d: 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300

ALFIilPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 
GCK : 241 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI 300 

5 NOV3d: 301 QLKDHIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPSSIMNVPGESTLRREFLRLQQ 360 
QLKDH I PSSIMNVPGES TLRRE FLRLQQ 

GCK : 301 QLKDHIDRSRKKRGEECEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ 360 

N0V3d: 361 ENKSNSEALKXXXXXXXXXXRDPEAHIKHLLHXXXXXXXXXXXXXXXXXXXXXXXXXXXX 420 
10 ENKSNSEALK RDPEAHIKHLLH 

GCK : 361 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK 420 

NOV3d: 4 21 XXXXXXXXXXXDMQALXXXXXXXXXXXXXXYIRHXXXXXXXXXXXXXXXXXXHAYLKSXX 4 80 

DMQAL Y R HAYLKS 

.15 GCK : 421 LQEKEQQRRLEDMQALRREEERRQAEREQEYKRKQLEEQRQSERLQRQLQQEHAYLKSLQ 4 80 

N0V3d: 481 XXXXXXXXXXXXXXXXXPGDRKPLYHYGRGMNPADKPAWAREWAH 526 

PGDRKPLYHYGRGMNPADKPAWAREV 
GCK : 481 QQQQQQQLQKQQQQQLLPGDRKPLYHYGRGMNPADKPAWAREVEERTRT-INKQQNSPLAKS 540 



20 



40 



N0V3d: 527 RVPLKPYAAP VPRSQ 541 

+ P++P P VPRSQ 
GCK : 541 KPGSTGPEPPIPQASPGPPGPLSQTPPMQRPVEPQEGPHKSLVAHRVPLKPYAAPVPRSQ 600 



25 N0V3d: 542 SLQDQPTRNLAAFPASHXXXXXXXXXXXXXXXRGAVIRQNSDPTSEGPGPSPNPPAWVRP 601 

SLQDQPTRNLAAFPASH RGAVIRQNSDPTSEGPGPSPNPPAWVRP 
GCK : 601 SLQDQPTRNLAAFPASHDPDPAIPAPTATPSARGAVIRQNSDPTSEGPGPSPNPPAWVRP 660 

N0V3d: 602 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 661 
30 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 

GCK : 661 DNEAPPKVPQRTSSIATALNTSGAGGSRPAQAVRASNPDLRRSDPGWERSDSVLPASHGH 720 

NOV3d: 662 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 721 
LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 
35 GCK : 721 LPQAGSLERNRVGVSSKPDSSPVLSPGNKAKPDDHRSRPGRPASYKRAIGEDFVLLKERT 780 

N0V3d: 722 LDEAPRPPKKAMDYXXXXXXXXXXXXXXXXXXXXXXXXXRDTPGGRSDGDTDSVSTMVVH 781 

LDEAPRPPKKAMDY RDTPGGRSDGDTDSVSTMWH 
GCK : 781 LDEAPRPPKKT^DYSSSSEEVESSEDDEEEGEGGPAEGSRDTPGGRSDGDTDSVSTMWH 840 



NOV3d: 782 DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 841 

DVEEITGTQPPYGGGTMVVQRTPEEERNLLHADSNGYTNLPDWQPSHSPTENSKGQSPP 
GCK : 841 DVEEITGTQPPYGGGTMWQRTPEEERNLLHADSNGYTNLPDVVQPSHSPTENSKGQSPP 900 



45 N0V3d: 8 42 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 901 

SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 
GCK : 901 SKDGSGDYQSRGLVKAPGKSSFTMFVDLGIYQPGGSGDSIPITALVGGEGTRLDQLQYDV 960 

NOV3d: 902 RKGSWNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 961 
50 RKGSWNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 
GCK : 961 RKGSWNVNPTNTRAHSETPEIRKYKKRFNSEILCAALWGVNLLVGTENGLMLLDRSGQG 
1020 

N0V3d : 9 62 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
55 1021 

KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
GCK : 1021 KVYGLIGRRRFQQMDVLEGLNLLITISGKRNKLRVYYLSWLRNKILHNDPEVEKKQGWTT 
1080 

60 N0V3d: 1022 VGDMEGCGHYRVVKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
1081 

VGDMEGCGHYRWKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL 
GCK : 1081 VGDMEGCGHYRWKYERIKFLVIALKSSVEVYAWAPKPYHKFMAFKSFADLPHRPLLVDL
1140 

N0V3d : 1082 TVEEGQRLKVI YGSSAGFHAVDVDSGNSYDI YIPVHIQSQITPHAIIFLPNTDGMEMLLC 
5 1141 

TVEEGQRLKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQITPHAIIFLPNTDGMEMLLC 
GCK : 1141 TVEEGQRLKVI YGSSAGFHAVDVDSGNSYDI YIPVHIQSQITPHAIIFLPNTDGMEMLLC 
1200 

10 NOV3d : 1142 YEDEGVYVNTYGRI IKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
1201 

YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 
GCK : 1201 YEDEGVYVNTYGRIIKDWLQWGEMPTSVAYICSNQIMGWGEKAIEIRSVETGHLDGVFM 



1260

NOV3d: 1202 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1244 (SEQ ID NO: 71) 

HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 
GCK ; 1261 HKRAQRLKFLCERNDKVFFASVRSGGSSQVYFMTLNRNCIMNW 1303 (SEQ ID NO: 38) 



Based on its relatedness to known members of the STE20 family of protein kinases,
NOV3d provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the STE20 family of
protein kinases. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving metabolic and endocrine disorders,
cancer, bone disorders, and tissue/cell growth regulation disorders.

Table 24 shows a multiple sequence alignment of the disclosed NOV-3 polypeptides
with a STE20 protein (GenBank Accession No: BAA90753), indicating the homology
between the present invention and a known member of the protein family.

TABLE 24 

STE20 MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3b MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3a MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3d MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQLAAIKVMDVTE
NOV3c MGDPAPARSLDDIDLSALRDPAGIFELVEWGNGTYGQVYKGRHVKTGQIAAIKVMDVTE

STE20 DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3b DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3a DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3d DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT
NOV3c DEEEEIKQEINMLKKYSHHRNIATYYGAFIKKSPPGNDDQLWLVMEFCGAGSVTDLVKNT

STE20 KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3b KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3a KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3d KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
NOV3c KGNALKEDCIAYICREILRGLAHLHAHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDR
STE20 TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3b TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3a TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3d TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
NOV3c TVGRRNTFIGTPYWMAPEVIACDENPDATYDYRSDIWSLGITAIEMAEGAPPLCDMHPMR
      ************************************************************
STE20 ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3b ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3a ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3d ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI
NOV3c ALFLIPRNPPPRLKSKKWSKKFIDFIDTCLIKTYLSRPPTEQLLKFPFIRDQPTERQVRI

STE20 QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3b QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3a QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3d QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ
NOV3c QLKDHIDRSRKKRGEKEETEYEYSGSEEEDDSHGEEGEPSSIMNVPGESTLRREFLRLQQ



STE20 ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3b ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3a ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3d ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
NOV3c ENKSNSEALKQQQQLQQQQQRDPEAHIKHLLHQRQRRIEEQKEERRRVEEQQRREREQRK
      ************************************************************
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40 



STE20 
NOV3b 
NOV3a 
N0V3d 
N0V3c 



-STE20 
NOV3b 
NOV 3 a 
NOV3d 
NOV3C 



LQEKEQQRRLEDMQAIiRREEERRQAEREQEYKRKQLEE . 

LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEE 

LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY 

LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEE 

LQEKEQQRRLEDMQALRREEERRQAEREQEYIRHRLEEEQRQLEILQQQLLQEQALLLEY 

QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 

QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 

KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 

QRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 

KRKQLEEQRQSERLQRQLQQEHAYLKSLQQQQQQQQLQKQQQQQLLPGDRKPLYHYGRGM 



45 
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STE20 
NOV3b 
NOV3a 
N0V3d 
NOV3C 



STE20 
NOV.3b 
N0V3a 
NOV3d 
NOV3C 



NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 
NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPIiSQTPPMQRP 
NPADKPAWAREVEERTRMNKQQNSPLAKSKPGSTGPEPPIPQASPGPPGPLSQTPPMQRP 

NPADKPAWAREV ' 

NPADKPAWAREV 

•k * * * * -k -k * * * -k -k 

VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 
VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 
VEPQEGPHKSLVAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 

VAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 

VAHRVPLKPYAAPVPRSQSLQDQPTRNLAAFPASHDPDPAIPAPTATPS 

************************************************* 
STE20 ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3b ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3a ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3d ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
NOV3c ARGAVIRQNSDPTSEGPGPSPNPPAWVRPDNEAPPKVPQRTSSIATALNTSGAGGSRPAQ
      ************************************************************
STE20 AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3b AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3a AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3d AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
NOV3c AVRASNPDLRRSDPGWERSDSVLPASHGHLPQAGSLERNRVGVSSKPDSSPVLSPGNKAK
      ************************************************************
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STE20 
N0V3b 
N0V3a 
NOV3d 
N0V3c 



STE20 
^30V3b 
N0V3a 
N0V3d 
NOV3C 



PDDHRSRPGRPA DFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 

PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 
PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKECAMDYSSSSEEVESSEDDEEEG 
PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPEOCAMDYSSSSEEVESSEDDEEEG 
PDDHRSRPGRPASYKRAIGEDFVLLKERTLDEAPRPPKKAMDYSSSSEEVESSEDDEEEG 

-k-k-k-k-k^c-k-k-k*** ******'k-k***'k-k*'k****k'k'kk'k-k*k-k*k^'k'k*-k-k'k-k'k^ 

EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTMWHDVEEITGTQPPYGGGTMWQRTPEEERNLLH 
EGGPAEGSRDTPGGRSDGDTDSVSTiyiWHDVEEITGTQPPYGGGTMVVQRTPEEERNLLH 

k**-kkkkk*k*kk**'kkk*kk*kkk'k'kk'k'k-k**-k'k'k*'kkk*k*'k*'k*kk'k'kk*kk'li'k'k'kk 
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30 



35 



STE20 
N0V3b 
NOV3a 
N0V3d 
NOV3c 



STE20 
N0V3b 
NOV3a 
N0V3d 
N0V3C 



ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDVVQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 
ADSNGYTNLPDWQPSHSPTENSKGQSPPSKDGSGDYQSRGLVKAPGKSSFTMFVDLGIY 

•k'kkk'k-k*'kkk'k'k*kkk*k'k**k-kkk-k-k'k**k*-k-k'kkkk-kk-kkk'k*'kk*k-k^**-k*k'k'k'kk 

QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSiPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 
QPGGSGDSIPITALVGGEGTRLDQLQYDVRKGSWNVNPTNTRAHSETPEIRKYKKRFNS 



40 



STE20 
N0V3b 
N0V3a 
N0V3d 
NOV3C 



EILCAALWGVNLLVGTENGLMLLDRSGQGECVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 
EILCAALWGVNLLVGTENGLMLLDRSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGKRN 

**kkkkk**k*-kk****kk*kkkkkkkkk**kkk*k**kkkkkk^irk*kkkirkk*kk^k*k 
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STE20 
NOV 3b 
NOV3a 
NOV3d 
NOV3C 



STE20 
NOV3b 
NOV3a 
NOV3d 
NOV3C 



KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRVVKYERIKFLVIALKSSVEV 
KLRVYYLSWLRNKILHNDPEVEKKQGWTTVGDMEGCGHYRWKYERIKFLVIALKSSVEV 

'k-k'kk'kkk4e'k-k-kkk'k'kkkk*-kk***k'k'k-k'k'k*kk*'kk*'k*kk-kk'k-kk'kk*k'k******-k'k'k 

YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 
YAWAPKPYHKFMAFKSFADLPHRPLLVDLTVEEGQRLKVIYGSSAGFHAVDVDSGNSYDI 

'kkkk-kkk^ic^'k-kk'k'k-kkkkkitkk'kk'k'k-k-k-kkkkicrk'kk-k-k-k-kkk-k-kk'kkk^'kkk'k-kk'k'kk-k 
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STE20 
NOV3b 
N0V3a 
N0V3d 
N0V3C 



y I PVHIQSQITPHAI I FLPNTDGMEMLLC YEDEGVYVNT YGRI IKDWLQWGEMPTSVAY 
YI PVHIQSQITPHAl I FLPNTDGMEMLLC YEDEGVYVNT YGRI IKDWLQWGEMPTSVAY 
YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
YIPVHIQSQITPHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIKDWLQWGEMPTSVAY 
YI PVHIQSQITPHAI I FLPNTDGMEMLLC YEDEGVYVNTYGRI IKDWLQWGEMPTSVAY 

****************************************** kki.k^k'k'k'kkkk-k^'kk^ic 
STE20 ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NOV3b ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NOV3a ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NOV3d ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
NOV3c ICSNQIMGWGEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERNDKVFFASVRSGGSSQVY
      ************************************************************

STE20 FMTLNRNCIMNW (SEQ ID NO: 39)
NOV3b FMTLNRNCIMNW (SEQ ID NO: 9)
NOV3a FMTLNRNCIMNW (SEQ ID NO: 7)
NOV3d FMTLNRNCIMNW (SEQ ID NO: 13)
NOV3c FMTLNRNCIMNW (SEQ ID NO: 11)
      ************

Consensus key

* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
  - no consensus
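The consensus key above can be illustrated with a short sketch that derives the consensus symbol for each alignment column. The strong and weak residue groups listed here are the ones conventionally used by ClustalW-style aligners; they are an assumption of this example, since the text does not list them, and the function names are hypothetical.

```python
# Residue groups conventionally used for ':' (strong) and '.' (weak) columns.
# Assumption: the alignment in Table 24 used ClustalW-style groups.
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Return the consensus character for one alignment column."""
    residues = set(column)
    if "-" in residues:            # a gap in any row means no consensus
        return " "
    if len(residues) == 1:         # every row carries the same residue
        return "*"
    if any(residues <= set(group) for group in STRONG):
        return ":"
    if any(residues <= set(group) for group in WEAK):
        return "."
    return " "


def consensus_line(rows):
    """Build the consensus line for equal-length aligned rows."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    return "".join(consensus_symbol(col) for col in zip(*rows))


# Final block of Table 24: all five rows are identical.
print(consensus_line(["FMTLNRNCIMNW"] * 5))
# Toy two-row example (not from the document) showing each symbol class.
print(repr(consensus_line(["AKAA", "ARGW"])))
```

A column containing a gap is reported as "no consensus" here, which matches the blank positions under the gapped regions of the alignment.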

Based on the relatedness between NOV-3 and STE20 kinases, the disclosed NOV-3
proteins are novel members of the STE20 protein kinase family. Therefore, the nucleic acids
and proteins of the invention are useful in potential therapeutic applications implicated in
the various pathologies and disorders described above and in other pathologies and disorders
related to aberrant function or aberrant expression of these STE20 protein kinases.

Potential therapeutic uses for the nucleic acids and proteins of the invention include, by
way of nonlimiting example, protein therapeutic, small molecule drug target, antibody target
(including therapeutic, diagnostic, or drug targeting/cytotoxic antibodies), diagnostic and/or
prognostic marker, gene therapy (gene delivery/gene ablation), research tools, and tissue
regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing
these tissues and cell types derived from these tissues).

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in the various pathologies and disorders described above, as well as
other pathologies or disorders. For example, a cDNA encoding the STE20 protein kinase-like
protein may be useful in gene therapy, and the STE20 protein kinase-like protein may be
useful when administered to a subject in need thereof. By way of nonlimiting example, the
compositions of the present invention will have efficacy for treatment of patients suffering
from the pathologies described above. The novel nucleic acids encoding the STE20 protein
kinase-like proteins, and the STE20 protein kinase-like proteins of the invention, or fragments
thereof, may further be useful in diagnostic applications, wherein the presence or amount of
the nucleic acid or the protein is to be assessed. These materials are further useful in the
generation of antibodies that bind immunospecifically to the novel substances of the invention
for use in therapeutic or diagnostic methods.
NOV-4: A Novel Trypsin Inhibitor-like Protein

The NOV-4 sequences (NOV-4a, NOV-4b, NOV-4c, NOV-4d, and NOV-4e)
according to the invention are nucleotide sequences encoding respective polypeptides related
to trypsin inhibitor proteins.

The disclosed NOV-4 sequences are splice variants. Splice variants occur naturally.
When a variant and the original sequence have the same or opposite activity, they may differ
in various properties not directly connected to biological activity. A certain variant may be
expressed mainly in one tissue, while the original sequence from which it has been varied, or
another variant derived from the same sequence, may be expressed mainly in another tissue.
The presence or level of specific splice variants may be the cause, and/or indicative of, a
disease, disorder, pathological or normal condition.

Because a drug may be effective against one variant but not another, or may cause side
effects because it targets all splice variants, an effective drug needs to target the particular
splice variant. Because soluble variants with therapeutic or disease-related functions may be
naturally occurring in specific tissues, they may be optimal candidates for drug targets or
protein therapeutics. Variants may have no activity at all and may serve as dominant negative
natural inhibitors. Thus, splice variants are useful in generating new drug targets, protein
therapeutics, and markers for diagnostics.

NOV-4 sequences according to the invention encode polypeptides related to trypsin
inhibitor proteins that are expressed in brain tumors, polypeptides related to sperm coat
glycoproteins, and polypeptides related to glioma pathogenesis related proteins. See
Yamakawa et al., 1998, Biochim Biophys Acta 1395(2):202-8; Murphy et al., 1995, Gene
159(1):131-5. In addition, similarities were found between NOV-4 and insect allergens in
wasps, hornets, fire ants, and secreted/membrane proteins in nematode pathogens. See J
Allergy Clin Immunol 1990, 85(6):988-96. Therefore, the nucleic acids and proteins of the
NOV-4 splice variants described in this invention can have functions similar to those of these proteins.

NOV-4 proteins are expressed in the following tissues: pituitary gland, mammary 
gland, adrenal gland, thalamus, and fetal lung. 

Functional roles attributed to trypsin inhibitor proteins include sperm coat maturation,
immunological responses, glioma pathogenesis, and signal transduction pathways. Thus,
NOV-4 nucleic acids and polypeptides, antibodies and related compounds according to the
invention will be useful in therapeutic and diagnostic applications in disorders associated with,
e.g., reproductive disorders, immunological disorders, cancer, and metabolic disorders.
Additional utilities for NOV-4 nucleic acids and polypeptides according to the invention are
disclosed herein.



NOV-4a 

A NOV-4a sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4a nucleic acid and its
encoded polypeptide are included in Table 25. The disclosed nucleic acid (SEQ ID NO: 14) is
2305 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 453, and ends with a TGA stop codon at nucleotide 1602.
A disclosed, representative ORF encodes a 383 amino acid polypeptide (SEQ ID NO: 15).
NOV-4a is missing one exon in the 5' nucleotide region compared to other splice variants
(NOV-4b and NOV-4c), resulting in an alternative methionine start codon and a Kozak
sequence.
TABLE 25 

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCGCG 
CTGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGA 
GCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCGTGAGTC 
CCATAGTTGCTACAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCTTGGG 
GCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGA 
GGAGCTGCTCAGCAAATACCAGCACAACGAGTCTCACTCCCGGGTCCGCAGAGCCATCCC 
CAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAGCTTCGGGGCCAGGTGCAGCC 
TCAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGGGGCAGGTATCGCTCTC 
CGGGGTTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACACCTACCCCTACCCGA 
GCGAGTGCAACCCCTGGTGTCCAGAGAGGTGCTCGGGGCCTATGTGCACGCACTACACAC 
AGATAGTTTGGGCCACCACCAACAAGATCGGTTGTGCTGTGAACACCTGCCGGAAGATGA 
CTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCTGCAATTATTCTCCAAAGG 
GGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTGCTCTGAGTGCCCACCCA 
GCTATGGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAAGAAACCTACACTCCAAAAC 
CTGAAACGGACGAGATGAATGAGGTGGAAACGGCTCCCATTCCTGAAGAAAACCATGTTT 
GGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAAGAAAACCTCTGCGGTCAACTACA 
TGACCCAAGTCGTCAGATGTGACACCAAGATGAAGGACAGGTGCAAAGGGTCCACGTGTA 
ACAGGTACCAGTGCCCAGCAGGCTGCCTGAACCACAAGGCGAAGATCTTTGGAACTCTGT 
TCTATGAAAGCTCGTCTAGCATATGCCGCGCCGCCATCCACTACGGGATCCTGGATGACA 
AGGGAGGCCTGGTGGATATCACCAGGAACGGGAAGGTCCCCTTCTTCGTGAAGTCTGAGA 
GACACGGCGTGCAGTCCCTCAGCAAATACAAACCTTCCAGCTCATTCATGGTGTCAAAAG 
TGAAAGTGCAGGATTTGGACTGCTACACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGC 
CAGCAACTCACTGCCCAAGAATCCATTGTCCGGCACACTGCAAAGACGAACCTTCCTACT 
GGGCTCCGGTGTTTGGAACCAACATCTATGCAGATACCTCAAGCATCTGCAAGACAGCCG 
TGCACGCGGGAGTCATCAGCAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATA 
AAAAGAAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGGGACTC 
CTCGGGATGGAAAGGCCTTCCGGATCTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGG
GAGAAGGGGCGTCTTCAGGAGGGCTTCGGGGTTTTGCTTTTATTTTTATTTTGTCATTGC 
GGGGTATATGGAGAGTCAGGAAACTTCCTTTGACTGATGTTCAGTGTCCATCACTTTGTG 
GCCTGTGGGTGAGGTGACATCTCATCCCCTCACTGAAGCAACAGCATCCCAAGGTGCTCA 
GCCGGACTCCCTGGTGCCTGATCCTGCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCT
CTCCTTTAGAGATCTGAGCTGTCTCTTAAAGGGGACAGTTGCCCAAAATGTTCCTTGCTA 
TGTGTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCCAAAAGAACAAACCAT 
TTGAAGCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCAAGCGCCA 
ATGAGTTTCAGGAATGAAGTAGAAGGTAGTTATTTAAAAATAAAAAACACAGTCCGTCCC 
TACCAATAGAGGAAAATGGTTTTAATGTTTGCTGGTCAGACAGACAAATGGGCTAGAGTA
AGAGGGCTGCGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCTGGCGGCC 
CGCCACAGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCAGTGCTGGTT 
TATGTAAAGTTCAGCAGTCACTTCA (SEQ ID NO: 14) 

MTNWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTN
KIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGS
CRNNLCYREETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMT
QVVRCDTKMKDRCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILD
DKGGLVDITRNGKVPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLC
PFEKPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGD
VDVMPVDKKKTYVGSLRNGVQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 15)

The disclosed NOV-4a amino acid sequence has a high level of homology (99% identity,
99% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 26. Moreover, the NOV-4a amino acid sequence has homology
(72% identity, 82% similarity) to a known human trypsin inhibitor (TREMBL ACC No:
043692), also shown in Table 26. As indicated by the "Expect" values, the probability of
these alignments occurring by chance alone is 0.0 and 5.3e-51, respectively.
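The identity percentages quoted above follow from the match counts reported in Table 26. The sketch below assumes the percentage is the integer part of matches divided by aligned length, which is consistent with the figures in Tables 22, 23 and 26; the function names are hypothetical.

```python
def blast_percent(matches, aligned_length):
    """Integer percentage, truncated rather than rounded (assumption)."""
    return (100 * matches) // aligned_length


def count_identities(row_a, row_b):
    """Count columns in which two aligned rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")


# Counts reported in Table 26.
print(blast_percent(380, 381))   # 99
print(blast_percent(85, 117))    # 72

# First ten columns of the second alignment in Table 26.
print(count_identities("GRYRSPGFHV", "GRYRSILQLV"))   # 6
```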

TABLE 26

• Score = 786 bits (2031), Expect = 0.0

Identities = 380/381 (99%), Positives = 381/381 (99%)

NOV4a:   3 NWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCA 62
           +WGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCA
TRYP : 117 HWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCA 176

NOV4a:  63 VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 122
           VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR
TRYP : 177 VNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 236

NOV4a: 123 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 182
           EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD
TRYP : 237 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 296

NOV4a: 183 RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 242
           RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV
TRYP : 297 RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 356

NOV4a: 243 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 302
           PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH
TRYP : 357 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 416

NOV4a: 303 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 362
           CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV
TRYP : 417 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 476

NOV4a: 363 QSESLGTPRDGKAFRIFAVRQ 383 (SEQ ID NO: 72)
           QSESLGTPRDGKAFRIFAVRQ
TRYP : 477 QSESLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 40)



• Score = 530 (186.6 bits), Expect = 5.3e-51, P = 5.3e-51
Identities = 85/117 (72%), Positives = 91/117 (82%)

N0V4a: 5 GRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVN 64 

GRYRS V+ WYDEVKDY +PYP +CNP CP RC GPMCTHYTQ+VWAT+N+IGCA+ + 
TRYP : 130 GRYRSILQLVKPWYDEVKDYAFPYPQDCNPRCPr4RCFGPMCTHYTQMVWATSNRIGCAIH 189 

20 

NOV4a:  65 TCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY 121 (SEQ ID NO: 73)
           TC+ M VWG VW AVY VCNY+PKGNWIGEAPYK G PCS CPPSYGGSC +NLC+
TRYP : 190 TCQNMNVWGSVWRRAVYLVCNYAPKGNWIGEAPYKVGVPCSSCPPSYGGSCTDNLCF 246 (SEQ ID NO: 41)



Furthermore, a PROSITE database search of protein families and domains confirmed
that a NOV-4a polypeptide is a member of the trypsin inhibitor family. One of the conserved
regions found in trypsin inhibitors is an SCP domain, located in the C-terminal half. The
pattern of this conserved domain is:
[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56).
This pattern is found in amino acids 81-92 of SEQ ID NO: 15.
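
The reported position of the SCP signature can be checked with a short sketch. It assumes the signature reads [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN] (a 12-residue pattern, consistent with the 12-residue span quoted above), translates it into a regular expression, and scans NOV-4a residues 65-121 as they appear in the pairwise alignment. The helper names `prosite_to_regex` and `find_motif` are illustrative only and do not belong to any PROSITE tool; the converter handles just the elements that occur in this pattern.

```python
import re

# Assumed form of the SCP signature (SEQ ID NO: 56); see the note above.
SCP_PATTERN = "[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN]"

# NOV-4a residues 65-121, copied from the pairwise alignment with the trypsin inhibitor.
FRAGMENT = "TCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY"
FRAGMENT_START = 65  # residue number of the first letter of FRAGMENT


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Only three element types are supported: 'x' (any residue), a bracketed
    set such as [GL], and a single fixed residue such as C.
    """
    elements = pattern.split("-")
    # 'x' becomes the regex wildcard; sets and single letters are valid regex as-is.
    return "".join("." if element == "x" else element for element in elements)


def find_motif(fragment: str, fragment_start: int, pattern: str):
    """Return (first_residue, last_residue, matched_text) or None if absent."""
    match = re.search(prosite_to_regex(pattern), fragment)
    if match is None:
        return None
    first = fragment_start + match.start()
    last = fragment_start + match.end() - 1
    return first, last, match.group()


if __name__ == "__main__":
    print(find_motif(FRAGMENT, FRAGMENT_START, SCP_PATTERN))
```

Under these assumptions the match is YFVCNYSPKGNW at residues 81-92, which agrees with the position stated for SEQ ID NO: 15.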

PSORT analyses indicate that NOV-4a is likely located in the nucleus (certainty =
0.3000). The predicted molecular weight of NOV-4a is 43185.7 daltons.

Based on its relatedness to known members of the trypsin inhibitor family of proteins,
NOV-4a provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the trypsin inhibitor
protein family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.



NOV-4b 

A disclosed NOV-4b sequence according to the invention is a nucleic acid sequence
that encodes a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4b nucleic
acid and its encoded polypeptide are included in Table 27. The disclosed nucleic acid (SEQ
ID NO: 16) is 2400 nucleotides in length and contains an open reading frame (ORF) that
begins with an ATG initiation codon at nucleotide 205, and ends with a TGA stop codon at
nucleotide 1697. A disclosed, representative ORF encodes a 497 amino acid polypeptide
(SEQ ID NO: 17).



TABLE 27

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCGCGCTGTCGCCGCTGCTACCGCG 
TCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGAGCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAG 
CCCAGGCTGCCCCGTGAGTCCCATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCTTGGG 
GCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGAGGAGCTGCTCAGCAAATACC 

10 AGCACAACGAGTCTCACTCCCGGGTCCGCAGAGCCATCCCCAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAG 
CTTCGGGGCCAGGTGCAGCCTCAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGGAGAAGTCTGCTGCAGC 
GTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCGGTCTGCTGGTGTCCATCGGGCAGAACCTGGGCGCTCACTGGG 
GCAGGTATCGCTCTCCGGGGTTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACACCTACCCCTACCCGAGCGAG 
TGCAACCCCTGGTGTCCAGAGAGGTGCTCGGGGCCTATGTGCACGCACTACACACAGATAGTTTGGGCCACCACCAACAA 

15 GATCGGTTGTGCTGTGAACACCTGCCGGAAGATGACTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCTGCA 
ATTATTCTCCAAAGGGGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTGCTCTGAGTGCCCACCCAGCTAT 
GGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAAGAAACCTACACTCCAAAACCTGAAACGGACGAGATGAATGAGGT 
GGAAACGGCTCCCATTCCTGAAGAAAACCATGTTTGGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAAGAAAACCT 
CTGCGGTCAACTACATGACCCAAGTCGTCAGATGTGACACCAAGATGAAGGACAGGTGCAAAGGGTCCACGTGTAACAGG 

TACCAGTGCCCAGCAGGCTGCCTGAACCACAAGGCGAAGATCTTTGGAAGTCTGTTCTATGAAAGCTCGTCTAGCATATG
CCGCGCCGCCATCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATCACCAGGAACGGGAAGGTCCCCTTCT 
TCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCAAATACAAACCTTCCAGCTCATTCATGGTGTCAAAAGTGAAA 
GTGCAGGATTTGGACTGCTACACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAATCCA 
TTGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCCGGTGTTTGGAACCAACATCTATGCAGATACCTCAAGCA 

25 TCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAAAAG 
AAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGGGACTCCTCGGGATGGAAAGGCCTTCCGGAT 
CTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGGGAGAAGGGGCGTCTTCAGGAGGGCTTCGGGGTTTTGCTTTTATTT 
TTATTTTGTCATTGCGGGGTATATGGAGAGTCAGGAAACTTCCTTTGACTGATGTTCAGTGTCCATCACTTTGTGGCCTG 
TGGGTGAGGTGACATCTCATCCCCTCACTGAAGCAACAGCATCCCT^GGTGCTCAGCCGGACTCCCTGGTGCCTGATCCT 

30 GCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCTCTCCTTTAGAGATCTGAGCTGTCTCTTAAAGGGGACAGTTGCCCA 
AAATGTTCCTTGCTATGTGTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCCAAAAGAACAAACCATTTGAA 
GCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCAAGCGCCAATGAGTTTCAGGAATGAAGTAGAAG 
GTAGTTATTTAAAAATAAAAAACACAGTCCGTCCCTACCAATAGAGGAAAATGGTTTTAATGTTTGCTGGTCAGACAGAC 
AAATGGGCTAGAGTAAGAGGGCTGCGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCTGGCGGCCCGCCA 

35 CAGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCAGTGCTGGTTTATGTAAAGTTCAGCAGTCACTTCA 

(SEQ ID NO: 16)

MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEILMLHNKLRGQVQPQASNM
EYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERC
SGPMCTHYTQIVWATTNKIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRN
NLCYREETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCKGSTCNRYQC
PAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFVKSERHGVQSLSKYKPSSSFMVSKVK
VQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPV
DKKKTYVGSLRNGVQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 17)

The disclosed NOV-4b amino acid sequence has 124 of 191 amino acid residues (64%)
identical to, and 148 of 191 (77%) similar to, a known human trypsin inhibitor (TREMBL
ACC No: O43692), as shown in Table 28. As indicated by the "Expect" value, the probability
of this alignment occurring by chance alone is 6.1e-73, which is a very low probability score.
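
The identity figures quoted above can be reproduced with a small sketch. It counts matching positions between two aligned rows and converts a count to a whole-number percentage by rounding down, which is an assumption chosen because it reproduces the quoted 64% (124/191 is about 64.9%). The function names are illustrative, and the two short rows are the final block of the NOV-4b alignment (residues 225-235 against 236-246).

```python
def count_identities(query: str, subject: str) -> int:
    """Count aligned positions where both rows carry the same residue."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    # A gap character never counts as an identity.
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


def truncated_percent(numerator: int, denominator: int) -> int:
    """Whole-number percentage, rounded down (assumed reporting convention)."""
    return numerator * 100 // denominator


QUERY = "YGGSCRNNLCY"    # NOV-4b residues 225-235
SUBJECT = "YGGSCTDNLCF"  # known trypsin inhibitor residues 236-246

if __name__ == "__main__":
    print(count_identities(QUERY, SUBJECT), "identities in the final block")
    print(truncated_percent(124, 191), "% identity overall")
    print(truncated_percent(148, 191), "% similarity overall")
```

Counting "positives" would additionally require the substitution matrix used for the search, which is not stated here, so the sketch is limited to identities.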

TABLE 28 

Score = 737 (259.4 bits), Expect = 6.1e-73, P = 6.1e-73
Identities = 124/191 (64%), Positives = 148/191 (77%) 

NOV4b: 45 SRVRRAIPREDKEEILMLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT 104
+R +R I + D IL HN++RG+V P A+NMEYM WD+ L KSA AWA+ CIW+HGP+
TRYP : 56 ARRKRYISQNDMIAILDYHNQVRGKVFPPAANMEYMVWDENLAKSAEAWAATCIWDHGPS 115

NOV4b: 105 GLLVSIGQNLGAHWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQ 164
LL +GQNL GRYRS V+ WYDEVKDY +PYP +CNP CP RC GPMCTHYTQ
TRYP : 116 YLLRFLGQNLSVRTGRYRSILQLVKPWYDEVKDYAFPYPQDCNPRCPMRCFGPMCTHYTQ 175

NOV4b: 165 IVWATTNKIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPS 224
+VWAT+N+IGCA++TC+ M VWG VW AVY VCNY+PKGNWIGEAPYK G PCS CPPS
TRYP : 176 MVWATSNRIGCAIHTCQNMNVWGSVWRRAVYLVCNYAPKGNWIGEAPYKVGVPCSSCPPS 235

NOV4b: 225 YGGSCRNNLCY 235 (SEQ ID NO: 74)
YGGSC +NLC+
TRYP : 236 YGGSCTDNLCF 246 (SEQ ID NO: 42)

Furthermore, a PROSITE database search of protein families and domains confirmed
that NOV-4b is a member of the trypsin inhibitor family. One of the conserved regions found
in trypsin inhibitors is an SCP domain, located in the C-terminal half. The pattern of this
conserved domain is:
[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56).
This pattern is found in amino acids 195-206 of SEQ ID NO: 17.

SignalPep and PSORT analyses indicate that NOV-4b is likely located outside of
the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal sequence with
a cleavage site between positions 22 and 23: SQG-YL. The predicted molecular weight of
NOV-4b is 55928.2 daltons.
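
The stated cleavage site can be illustrated by splitting the N-terminus of the NOV-4b polypeptide after residue 22. This is only a sketch of what "between positions 22 and 23: SQG-YL" means; it does not predict the site, and the function name is illustrative. The 40 residues used are the start of the SEQ ID NO: 17 sequence given in Table 27.

```python
# First 40 residues of the NOV-4b polypeptide (SEQ ID NO: 17).
N_TERMINUS = "MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQH"


def split_at_cleavage(sequence: str, last_signal_residue: int):
    """Split a sequence after the given 1-based residue number.

    Returns (signal_peptide, mature_n_terminus).
    """
    if not 0 < last_signal_residue < len(sequence):
        raise ValueError("cleavage position must fall inside the sequence")
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


if __name__ == "__main__":
    signal, mature = split_at_cleavage(N_TERMINUS, 22)
    # The three residues on the signal side and two on the mature side
    # correspond to the SQG-YL notation used in the text.
    print(f"{signal[-3:]}-{mature[:2]}")
```

The split leaves a 22-residue signal peptide ending in SQG and a mature chain beginning with YL.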

Based on its relatedness to known members of the trypsin inhibitor family of proteins,
NOV-4b provides new diagnostic and therapeutic compositions useful in the treatment of
disorders associated with alterations in the expression of members of the trypsin inhibitor
protein family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.

NOV-4c

A NOV-4c sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4c nucleic acid and its
encoded polypeptide are included in Table 29. The disclosed nucleic acid (SEQ ID NO: 18) is
1669 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 205, and ends with a TAG stop codon at nucleotide 1636.
The representative ORF encodes a 477 amino acid polypeptide (SEQ ID NO: 19).

TABLE 29

TCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCGCGC 

TGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGAG

CCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCGTGAGTCC 

CATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCTTGGGG 

15 

CTGCTGTTCCTGGTCTGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGAG 

GAGCTGCTCAGCAAATACCAGCACAACGAGTCTCACTCCCGGGTCCGCAGAGCCATCCCC 

AGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAGCTTCGGGGCCAGGTGCAGCCT 

CAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGGAGAAGTCTGCTGCAGCG 

TGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCGGTCTGCTGGTGTCCATCGGGCAG 
20 AACCTGGGCGCTCACTGGGGCAGGTATCGCTCTCCGGGGTTCCATGTGCAGTCCTGGTAT 

GACGAGGTGAAGGACTACACCTACCCCTACCCGAGCGAGTGCAACCCCTGGTGTCCAGAG 

AGGTGCTCGGGGCCTATGTGCACGCACTACACACAGATAGTTTGGGCCACCACCAACAAG 

ATCGGTTGTGCTGTGAACACCTGCCGGAAGATGACTGTCTGGGGAGAAGTTTGGGAGAAC 

GCGGTCTACTTTGTCTGCAATTATTCTCCAAAGGGGAACTGGATTGGAGAAGCCCCCTAC 
25 AAGAATGGCCGGCCCTGCTCTCAGTGCCCACCCAGCTATGGAGGCAGCTGCAGGAACAAC 

TTGTGTTACCGAGAAGAAACCTACACTCCAAAACCTGAAACGGACGAGATGAATGAGGTG 

GAAACGGCTCCCATTCCTGAAGAAAACCATGTTTGGCTCCAACCGAGGGTGATGAGACCC 

ACCAAGCCCAAGAAAACCTCTTCGGTCAACTACATGACCCAAGTCGTCTTATGTGACACC 

AAGATGAAGGACAGGTGCAAAGGGTCCACGTGTAACAGGTACCAGTGCCCAGCAGGCTGC
30 CTGAACCACAAGGCGAAGATCTTTGGAACTCTGTTCTATGAAAGCTCGTCTAGCATATGC 

CGCGCCGCCATCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATCACCAGG 

AACGGGAAGGTCCCCTTCTTCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCAAA 

TACAAACCTTCCAGCTCATTCATGGTGTCAAAAGTGAAAGTGCAGGATTTGGACTGCTAC 

ACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAATCCAT 
TGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCCGGTGTTTGGAACCAACATC

TATGCAGATACCTCAAGCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAACGAG 

AGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAAAAGAAGACCTACACCTGCCCGGCA 

GCCGCTCGAGCCCTATAGTGTAAACCGATTCGCAGCACACTGGCGCCGT (SEQ ID NO: 18)

MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIP
REDKEEILMLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT
GLLVSIGQNLGAHWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSG
PMCTHYTQIVWATTNKIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIG
EAPYKNGRPCSQCPPSYGGSCRNNLCYREETYTPKPETDEMNEVETAPIPEE
NHVWLQPRVMRPTKPKKTSSVNYMTQVVLCDTKMKDRCKGSTCNRYQCPAGC
LNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFVKSER
HGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH
CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY
TCPAAARAL (SEQ ID NO: 19)



The disclosed NOV-4c amino acid sequence has a high level of homology (97%
identity, 97% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 30. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.
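
The alignment tables report both an Expect value and a P value. Under the standard Poisson model used for database searches, the two are related by P = 1 - exp(-E), so for very small E the two numbers coincide, as in the tables here. The sketch below applies that relation; the function name is illustrative. An Expect printed as 0.0 is presumably a rounded display of a value too small to print rather than an exact zero.

```python
import math


def p_value_from_expect(expect: float) -> float:
    """Probability of at least one chance hit, given the expected number of hits.

    Uses P = 1 - exp(-E); expm1 keeps precision when E is extremely small.
    """
    if expect < 0:
        raise ValueError("Expect values cannot be negative")
    return -math.expm1(-expect)


if __name__ == "__main__":
    for expect in (6.1e-73, 5.3e-51, 0.0, 1.0):
        print(f"E = {expect:g}  ->  P = {p_value_from_expect(expect):g}")
```

For E = 6.1e-73 the computed P is indistinguishable from E, while for E = 1.0 it is about 0.63, which shows where the two measures begin to diverge.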



TABLE 30

Score = 948 bits (2452), Expect = 0.0
Identities = 458/468 (97%), Positives = 460/468 (97%)

NOV4c: 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60
MSCVLGGVIPLGLLFLVCGSQGYLLPNVT      SKYQHNESHSRVRRAIPREDKEEIL
TRYP : 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60

NOV4c: 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGR 120
MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT LLVSIGQNLGAHWGR
TRYP : 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR 120

NOV4c: 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTC 180
YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTC
TRYP : 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTC 180

NOV4c: 181 RKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSQCPPSYGGSCRNNLCYREETY 240
RKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCS+CPPSYGGSCRNNLCYREETY
TRYP : 181 RKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREETY 240

NOV4c: 241 TPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSSVNYMTQVVLCDTKMKDRCKG 300
TPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTS+VNYMTQVV CDTKMKDRCKG
TRYP : 241 TPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCKG 300

NOV4c: 301 STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV 360
STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV
TRYP : 301 STCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFFV 360

NOV4c: 361 KSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDE 420
KSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDE
TRYP : 361 KSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKDE 420

NOV4c: 421 PSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY 468 (SEQ ID NO: 75)
PSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY
TRYP : 421 PSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTY 468 (SEQ ID NO: 43)

Furthermore, a PROSITE database search of protein families and domains confirmed that
NOV-4c is a member of the trypsin inhibitor family. One of the conserved regions found in
trypsin inhibitors is an SCP domain, located in the C-terminal half. The pattern of this
conserved domain is:
[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56).
This pattern is found in amino acids 195-206 of SEQ ID NO: 19.

In addition, SignalPep and PSORT analyses indicate that NOV-4c is likely located 
outside of the cell (certainty = 0.8200), and is likely to have a cleavable N-terminal signal 
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted 
molecular weight of NOV-4c is 53587.7 daltons. 

Based on the relatedness between NOV-4c and the conserved trypsin inhibitor
proteins, the NOV-4c protein is a novel member of the trypsin inhibitor family. NOV-4c
provides new diagnostic and therapeutic compositions useful in the treatment of disorders
associated with alterations in the expression of members of the trypsin inhibitor protein
family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.

NOV-4d 

A NOV-4d sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4d nucleic acid and its
encoded polypeptide are included in Table 31. The disclosed nucleic acid (SEQ ID NO: 20) is
2403 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 206, and ends with a TGA stop codon at nucleotide 1700.
A disclosed, representative ORF encodes a 498 amino acid polypeptide (SEQ ID NO: 21).

TABLE 31 

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCGCGCTGTCGCCGCTGCTACCGCG 
TCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGATTGGAGCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAG 
CCCAGGCTGCCCCGTGAGTCCCATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATCCCCTTGGG 
GCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCACTCTCTTAGAGGAGCTGCTCAGCAAATACC 
AGCACAACGAGTCTCACTCCCGGGTCCGCAGAGCCATCCCCAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAG 
CTTCGGGGCCAGGTGCAGCCTCAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGGAGAAGTCTGCTGCAGC 
GTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCAGTCTGCTGGTGTCCATCGGGCAGAACCTGGGCGCTCACTGGG 
GCAGGAGGTATCGCTCTCCGGGGTTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACACCTACCCCTACCCGAGC 
GAGTGCAACCCCTGGTGTCCAGAGAGGTGCTCGGGGCCTATGTGCACGCACTACACACAGATAGTTTGGGCCACCACCAA 
CAAGATCGGTTGTGCTGTGAACACCTGCCGGAAGATGACTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCT
GCAATTATTCTCCAAAGGGGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTGCTCTGAGTGCCCACCCAGC 
TATGGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAAGAAACCTACACTCCAAAACCTGAAACGGACGAGATGAATGA 
GGTGGAAACGGCTCCCATTCCTGAAGAAAACCATGTTTGGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAAGAAAA 
CCTCTGCGGTCAACTACATGACCCAAGTCGTCAGATGTGACACCAAGATGAAGGACAGGTGCAAAGGGTCCACGTGTAAC
AGGTACCAGTGCCCAGCAGGCTGCCTGAACCAGAAGGCGAAGATCTTTGGAAGTCTGTTCTATGAAAGCTCGTCTAGCAT 
ATGCCGCGCCGCCATCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATCACCAGGAACGGGAAGGTCCCCT 
TCTTCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCAAATACAAACCTTCCAGCTCATTCATGGTGTCAAAAGTG 
AAAGTGCAGGATTTGGACTGCTACACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAAT 

10 CCATTGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCCGGTGTTTGGAACCAACATCTATGCAGATACCTCAA 
GCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAGCAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAA 
AAGAAGACCTACGTGGGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGGGACTCCTCGGGATGGAAAGGCCTTCCG
GATCTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGGGAGAAGGGGCGTCTTCAGGAGGGCTTCGGGGTTTTGCTTTTA 
TTTTTATTTTGTCATTGCGGGGTATATGGAGAGTCAGGAAACTTCCTTTGACTGATGTTCAGTGTCCATCACTTTGTGGC

15 CTGTGGGTGAGGTGACATCTCATCCCCTCACTGAAGCAACAGCATCCCAAGGTGCTCAGCCGGACTCCCTGGTGCCTGAT 
CCTGCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCTCTCCTTTAGAGATCTGAGCTGTCTCTTAAAGGGGACAGTTGC 

CCAAAATGTTCCTTGCTATGTGTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCCAAAAGAACAAACCATTT
GAAGCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCAAGCGCCAATGAGTTTCAGGAATGAAGTAG

AAGGTAGTTATTTAAAAATAAAAAACACAGTCCGTCCCTACCAATAGAGGAAAATGGTTTTAATGTTTGCTGGTCAGACA 
20 GACAAATGGGCTAGAGTAAGAGGGCTGCGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCTGGCGGCCCG 
CCACAGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCAGTGCTGGTTTATGTAAAGTTCAGCAGTCACT 

TCA (SEQ ID NO: 20)

MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEILMLHNKLRGQV 
QPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGRRYRSPGFHVQSWYDEVKDYT
YPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKGNWIGE
APYKNGRPCSECPPSYGGSCRNNLCYREETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTS
AVNYMTQVVRCDTKMKDRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVD
ITRNGKVPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD
EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSESLGTPRDGKA
FRIFAVRQ (SEQ ID NO: 21)

The disclosed NOV-4d amino acid sequence has a high level of homology (98%
identity, 98% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), as shown in Table 32. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.



TABLE 32 

Score = 1007 bits (2605), Expect = 0.0 
Identities = 489/498 (98%), Positives = 490/498 (98%), Gaps = 1/498 (0%)

NOV4d: 1 MSCVLGGVIPLGLLFLVRGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60
MSCVLGGVIPLGLLFLV GSQGYLLPNVT      SKYQHNESHSRVRRAIPREDKEEIL
TRYP : 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60

NOV4d: 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR 120
MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWG
TRYP : 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWG- 119

NOV4d: 121 RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT 180
RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT
TRYP : 120 RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQIVWATTNKIGCAVNT 179

NOV4d: 181 CRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREET 240
CRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREET
TRYP : 180 CRKMTVWGEVWENAVYFVCNYSPKGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYREET 239

NOV4d: 241 YTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCK 300
YTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCK
TRYP : 240 YTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKDRCK 299

NOV4d: 301 GSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF 360
GSTCNRYQCPAGCLNHKAKIFG+LFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF
TRYP : 300 GSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKVPFF 359

NOV4d: 361 VKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD 420
VKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD
TRYP : 360 VKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAHCKD 419

NOV4d: 421 EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE 480
EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE
TRYP : 420 EPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGVQSE 479

NOV4d: 481 SLGTPRDGKAFRIFAVRQ 498 (SEQ ID NO: 76)
SLGTPRDGKAFRIFAVRQ
TRYP : 480 SLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 44)

A PROSITE database search of protein families and domains confirmed that a NOV-4d
polypeptide is a member of the trypsin inhibitor family. One of the conserved regions found
in trypsin inhibitors is an SCP domain, located in the C-terminal half. The pattern of this
conserved domain is:
[LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN] (SEQ ID NO: 56).
This pattern is found in amino acids 196-207 of SEQ ID NO: 21.

Based on the relatedness between NOV-4d and the conserved trypsin inhibitor
proteins, NOV-4d is a novel member of the trypsin inhibitor family. NOV-4d provides new
diagnostic and therapeutic compositions useful in the treatment of disorders associated with
alterations in the expression of members of the trypsin inhibitor protein family. Nucleic acids,
polypeptides, antibodies, and other compositions of the present invention are useful in the
treatment and diagnosis of a variety of diseases and pathologies, including, by way of
nonlimiting example, those involving reproductive disorders, immunological disorders,
cancer, and metabolic disorders.

In addition, SignalPep and PSORT analyses indicate that NOV-4d is likely located
outside of the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted
molecular weight of NOV-4d is 56114.4 daltons.



NOV-4e

A NOV-4e sequence according to the invention is a nucleic acid sequence that encodes
a polypeptide related to trypsin inhibitor proteins. A disclosed NOV-4e nucleic acid and its
encoded polypeptide are included in Table 33. The disclosed nucleic acid (SEQ ID NO: 22) is
2412 nucleotides in length and contains an open reading frame (ORF) that begins with an
ATG initiation codon at nucleotide 206, and ends with a TGA stop codon at nucleotide 1709.
A disclosed, representative ORF encodes a 501 amino acid polypeptide (SEQ ID NO: 23).
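
The relationship between the ORF coordinates and the polypeptide length can be checked arithmetically. The sketch below assumes that the quoted stop position is the first nucleotide of the stop codon, so that the coding region runs from the ATG up to, but not including, that position. This reading is an assumption; it is adopted because it reproduces the lengths quoted for NOV-4d and NOV-4e. The function name is illustrative.

```python
def encoded_length(atg_position: int, stop_position: int) -> int:
    """Number of amino acids encoded between the ATG and the stop codon.

    Both positions are 1-based nucleotide numbers. stop_position is taken
    to be the first base of the stop codon (an assumption, see above).
    """
    coding_nucleotides = stop_position - atg_position
    if coding_nucleotides <= 0 or coding_nucleotides % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return coding_nucleotides // 3


if __name__ == "__main__":
    print("NOV-4d:", encoded_length(206, 1700), "amino acids")
    print("NOV-4e:", encoded_length(206, 1709), "amino acids")
```

A coordinate pair that raises the error indicates either a different numbering convention for the stop codon or a transcription slip in one of the coordinates.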

TABLE 33 

CTCTGACTGCTCCTATTGAGCTGTCTGCTCGCTGTGCCCGCTGTGCCTGCTGTGCCCG 

15 CGCTGTCGCCGCTGCTACCGCGTCTACTGGACGCGGGAGACGCCAGCGAGCTGGTGAT 
TGGAGCCCTGCGGAGAGCTCAAGCGCCCAGCTCTGCCCGAGGAGCCCAGGCTGCCCCG 
TGAGTCCCATAGTTGCTGCAGGAGTGGAGCCATGAGCTGCGTCCTGGGTGGTGTCATC 
CCCTTGGGGCTGCTGTTCCTGGTCCGCGGATCCCAAGGCTACCTCCTGCCCAACGTCA 
CTCTCTTAGAGGAGCTGCTCAGCAAATACCAGCACAACGAGTCTCACTCCCGGGTCCG 

20 CAGAGCCATCCCCAGGGAGGACAAGGAGGAGATCCTCATGCTGCACAACAAGCTTCGG 
GGCCAGGTGCAGCCTCAGGCCTCCAACATGGAGTACATGACCTGGGATGACGAACTGG 
AGAAGTCTGCTGCAGCGTGGGCCAGTCAGTGCATCTGGGAGCACGGGCCCACCGGTCT 
GCTGGTGTCCATCGGGCAGAACCTGGGCGCTCACTGGGGCAGGTATCGCTCTCCGGGG 
TTCCATGTGCAGTCCTGGTATGACGAGGTGAAGGACTACACCTACCCCTACCCGAGCG 

25 AGTGCAACCCCTGGTGTCCAGAGAGGTGCTCGGGGCCCATGTGCACGCACTACACACA 
GGTAACTCAGATAGTTTGGGCCACCACCAACAAGATCGGTTGTGCTGTGAACACCTGC 
CGGAAGATGACTGTCTGGGGAGAAGTTTGGGAGAACGCGGTCTACTTTGTCTGCAATT 
ATTCTCCAAAGAGGGGGAACTGGATTGGAGAAGCCCCCTACAAGAATGGCCGGCCCTG 
CTCTGAGTGCCCACCCAGCTATGGAGGCAGCTGCAGGAACAACTTGTGTTACCGAGAA 

30 GAAACCTACACTCCAAAACCTGAAACGGACGAGATGAATGAGGTGGAAACGGCTCCCA 
TTCCTGAAGAAAACCATGTTTGGCTCCAACCGAGGGTGATGAGACCCACCAAGCCCAA 
GAAAACCTCTGCGGTCAACTACATGACCCAAGTCGTCAGATGTGACACCAAGATGAAG 
GACAGGTGCAAAGGGTCCACGTGTAACAGGTACCAGTGCCCAGCAGGCTGCCTGAACC 
ACAAGGCGAAGATCTTTGGAAGTCTGTTCTATGAAAGCTCGTCTAGCATATGCCGCGC 

35 CGCCATCCACTACGGGATCCTGGATGACAAGGGAGGCCTGGTGGATATCACCAGGAAC 
GGGAAGGTCCCCTTCTTCGTGAAGTCTGAGAGACACGGCGTGCAGTCCCTCAGCAAAT 
ACAAACCTTCCAGCTCATTCATGGTGTCAAAAGTGAAAGTGCAGGATTTGGACTGCTA
CACGACCGTTGCTCAGCTGTGCCCGTTTGAAAAGCCAGCAACTCACTGCCCAAGAATC
CATTGTCCGGCACACTGCAAAGACGAACCTTCCTACTGGGCTCCGGTGTTTGGAACCA
ACATCTATGCAGATACCTCAAGCATCTGCAAGACAGCCGTGCACGCGGGAGTCATCAG
CAACGAGAGTGGGGGTGACGTGGACGTGATGCCCGTGGATAAAAAGAAGACCTACGTG 
GGCTCGCTCAGGAATGGAGTTCAGTCTGAAAGCCTGGGGACTCCTCGGGATGGAAAGG 
CCTTCCGGATCTTTGCTGTCAGGCAGTGAATTTCCAGCACCAGGGGAGAAGGGGCGTC 
5 TTCAGGAGGGCTTCGGGGTTTTGCTTTTATTTTTATTTTGTCATTGCGGGGTATATGG 

AGAGTCAGGAAACTTCCTTTGACTGATGTTCAGTGTCCATCACTTTGTGGCCTGTGGG
TGAGGTGACATCTCATCCCCTCACTGAAGCAACAGCATCCCAAGGTGCTCAGCCGGAC 
TCCCTGGTGCCTGATCCTGCTGGGGCCCGGGGGTCTCCATCTGGACGTCCTCTCTCCT 
TTAGAGATCTGAGCTGTCTCTTAAAGGGGACAGTTGCCCAAAATGTTCCTTGCTATGT 

10 GTTCTTCTGTTGGTGGAGGAAGTTGATTTCAACCTCCCTGCCAAAAGAACAAACCATT 
TGAAGCTCACAATTGTGAAGCATTCACGGCGTCGGAAGAGGCCTTTTGAGCAAGCGCC 
AATGAGTTTCAGGAATGAAGTAGAAGGTAGTTATTTAAAAATAAAAAACACAGTCCGT 
CCCTACCAATAGAGGAAAATGGTTTTAATGTTTGCTGGTCAGACAGACAAATGGGCTA 
GAGTAAGAGGGCTGCGGGTATGAGAGACCCCGGCTCCGCCCTGGCACGTGTCCTTGCT

15 GGCGGCCCGCCACAGGCCCCCTTCAATGGCCGCATTCAGGATGGCTCTATACACAGCA 
GTGCTGGTTTATGTAAAGTTCAGCAGTCACTTCA (SEQ ID NO: 22) 

MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEE
ILMLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGA
HWGRYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTN
KIGCAVNTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGS
CRNNLCYREETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQ
VVRCDTKMKDRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDK
GGLVDITRNGKVPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFE
KPATHCPRIHCPAHCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVM
PVDKKKTYVGSLRNGVQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 23)

The disclosed NOV-4e amino acid sequence has a high level of homology (97%
identity, 97% similarity) to a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), shown in Table 34. As indicated by the "Expect" value, the probability of this
alignment occurring by chance alone is 0.0, the lowest probability score.

TABLE 34 

Score = 1001 bits (2588), Expect = 0.0
Identities = 488/501 (97%), Positives = 489/501 (97%), Gaps = 4/501 (0%)

NOV4e: 1 MSCVLGGVIPLGLLFLVRGSQGYLLPNVTXXXXXXSKYQHNESHSRVRRAIPREDKEEIL 60
MSCVLGGVIPLGLLFLV GSQGYLLPNVT      SKYQHNESHSRVRRAIPREDKEEIL
TRYP : 1 MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL 60

NOV4e: 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWGR 120
MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPT LLVSIGQNLGAHWGR
TRYP : 61 MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR 120

NOV4e: 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTNKIGCAV 180
YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY   TQIVWATTNKIGCAV
TRYP : 121 YRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCAV 177

NOV4e: 181 NTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 240
NTCRKMTVWGEVWENAVYFVCNYSPK GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR
TRYP : 178 NTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCYR 236

NOV4e: 241 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 300
EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD
TRYP : 237 EETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMKD 296

NOV4e: 301 RCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 360
RCKGSTCNRYQCPAGCLNHKAKIFG+LFYESSSSICRAAIHYGILDDKGGLVDITRNGKV
TRYP : 297 RCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGKV 356

NOV4e: 361 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 420
PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH
TRYP : 357 PFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPAH 416

NOV4e: 421 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 480
CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV
TRYP : 417 CKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNGV 476

NOV4e: 481 QSESLGTPRDGKAFRIFAVRQ 501 (SEQ ID NO: 77)
QSESLGTPRDGKAFRIFAVRQ
TRYP : 477 QSESLGTPRDGKAFRIFAVRQ 497 (SEQ ID NO: 45)



In addition, SignalPep and PSORT analyses indicate that NOV-4e is likely located
outside of the cell (certainty = 0.6950), and is likely to have a cleavable N-terminal signal
sequence with a cleavage site between positions 22 and 23: SQG-YL. The predicted
molecular weight of NOV-4e is 56412.8 daltons.

Based on the relatedness between NOV-4e and the conserved trypsin inhibitor
proteins, the NOV-4e protein is a novel member of the trypsin inhibitor family. NOV-4e
provides new diagnostic and therapeutic compositions useful in the treatment of disorders
associated with alterations in the expression of members of the trypsin inhibitor protein
family. Nucleic acids, polypeptides, antibodies, and other compositions of the present
invention are useful in the treatment and diagnosis of a variety of diseases and pathologies,
including, by way of nonlimiting example, those involving reproductive disorders,
immunological disorders, cancer, and metabolic disorders.

Table 35 shows a sequence alignment between the NOV-4 polypeptides according to
the invention and a human trypsin inhibitor-like protein (GenBank Accession No:
CAB66795), indicating the homology between the present invention and the trypsin inhibitor
family. Moreover, the PROSITE conserved SCP region found in trypsin inhibitors is found in
sequences 151-162 of the trypsin inhibitor-like protein shown (shown in bold font).

TABLE 35

NOV4e  MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4a
NOV4b  MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4d  MSCVLGGVIPLGLLFLVRGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
NOV4c  MSCVLGGVIPLGLLFLVCGSQGYLLPNVTLLEELLSKYQHNESHSRVRRAIPREDKEEIL
TRYP                                               ARRKRYISQNDMIAIL

NOV4e  MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
NOV4a                                                        MTNWG-
NOV4b  MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
NOV4d  MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTSLLVSIGQNLGAHWGR
NOV4c  MLHNKLRGQVQPQASNMEYMTWDDELEKSAAAWASQCIWEHGPTGLLVSIGQNLGAHWG-
TRYP   DYHNQVRGKVFPPAANMEYMVWDENLAKSAEAWAATCIWDHGPSYLLRFLGQNLSVRTG-

NOV4e  RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHYTQVTQIVWATTNKIGCA
NOV4a  RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4b  RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4d  RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
NOV4c  RYRSPGFHVQSWYDEVKDYTYPYPSECNPWCPERCSGPMCTHY---TQIVWATTNKIGCA
TRYP   RYRSILQLVKPWYDEVKDYAFPYPQDCNPRCPMRCFGPMCTHY---TQMVWATSNRIGCA

*. ^r-kick-kifkic . . *** k* kk kkkkkkk kk • kkkk » k » kkkk 

NOV4e  VNTCRKMTVWGEVWENAVYFVCNYSPKRGNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4a  VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4b  VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4d  VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY
NOV4c  VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSQCPPSYGGSCRNNLCY
TRYP   IHTCQNMNVWGSVWRRAVYLVCNYAPK-GNWIGEAPYKVGVPCSSCPPSYGGSCTDNLCF

*** kk kkk.kkkk*kk kkkkkkkkkk ★ * * * kkkkkkkkk 

NOV4e  REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4a  REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4b  REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4d  REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSAVNYMTQVVRCDTKMK
NOV4c  REETYTPKPETDEMNEVETAPIPEENHVWLQPRVMRPTKPKKTSSVNYMTQVVLCDTKMK
TRYP

NOV4e  DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4a  DRCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4b  DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4d  DRCKGSTCNRYQCPAGCLNHKAKIFGSLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
NOV4c  DRCKGSTCNRYQCPAGCLNHKAKIFGTLFYESSSSICRAAIHYGILDDKGGLVDITRNGK
TRYP

NOV4e  VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4a  VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4b  VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4d  VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
NOV4c  VPFFVKSERHGVQSLSKYKPSSSFMVSKVKVQDLDCYTTVAQLCPFEKPATHCPRIHCPA
TRYP



NOV4e  HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4a  HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4b  HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG
NOV4d  HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYVGSLRNG

93 



BNSOOCID: <WO_Ot62928A2J_> 



AVO 01/62928 PCT/USO 1/061 51 

N0V4C HCKDEPSYWAPVFGTNIYADTSSICKTAVHAGVISNESGGDVDVMPVDKKKTYT 

TRYP 

5 N0V4e VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 23) 

N0V4a VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 15) 

N0V4b VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 17) 

N0V4d VQSESLGTPRDGKAFRIFAVRQ (SEQ ID NO: 21) 

N0V4C CPAAARAL (SEQ ID NO: 19) 

10 TRYP (SEQ ID NO: 46) 



Consensus key 
* - single, fully conserved residue 
: - conservation of strong groups 
. - conservation of weak groups
- no consensus 
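
The "*" symbol of the consensus key can be recomputed from any aligned block. The following is a minimal Python sketch, assuming the aligned rows have equal length; the function name is hypothetical, and only the fully conserved symbol is computed (the strong and weak group symbols are not).

```python
def fully_conserved_marks(rows):
    """Return '*' for each column in which every row has the same residue, else ' '."""
    if not rows:
        return ""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        # A gap column is never reported as conserved.
        conserved = first != "-" and all(residue == first for residue in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Last eight residues of the NOV4e and TRYP rows of one alignment block above.
print(repr(fully_conserved_marks(["SCRNNLCY", "SCTDNLCF"])))
```

Passing all six rows of a block, rather than two, yields the marks for the full alignment.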

The expression pattern and protein similarity information for NOV-4 suggest that the
human trypsin inhibitor-like proteins described in this invention may function as trypsin
inhibitors. Therefore, the nucleic acids and proteins of the invention are useful in potential
therapeutic applications implicated in, for example but not limited to, allergies and infectious
diseases, cancer, metabolic disorders like obesity, hypertension and diabetes, and other
diseases and disorders.

Homology to antigenic secreted and membrane proteins suggests that antibodies
directed against the novel genes may be useful in treatment and prevention of allergic
reactions and infectious diseases. Expression in pituitary and adrenal gland suggests
therapeutic applications in metabolic disorders like obesity, hypertension and diabetes.
Similarity to a brain tumor overexpressed trypsin inhibitor suggests that the splice variants of
10093872 may be involved in the pathogenesis of these cancers. Hence it could be useful as a
cancer diagnostic marker or as a target for small molecule trypsin inhibitors in cancer
treatment.

Potential therapeutic uses for the invention(s) include, for example, the following: (i)
protein therapeutic, (ii) small molecule drug target, (iii) antibody target (therapeutic,
diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic and/or prognostic marker, (v)
gene therapy (gene delivery/gene ablation), (vi) research tools, and (vii) tissue regeneration in
vitro and in vivo (regeneration for all these tissues and cell types composing these tissues and
cell types derived from these tissues).

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in various diseases and disorders described below and/or other
pathologies and disorders. For example, but not limited to, a cDNA encoding the human
trypsin inhibitor-like protein may be useful in gene therapy, and the human trypsin
inhibitor-like protein may be useful when administered to a subject in need thereof. By way of
non-limiting example, the compositions of the present invention will have efficacy for treatment
of patients suffering from, for example, but not limited to, allergies and infectious diseases,
cancer, metabolic disorders like obesity, hypertension and diabetes, and other diseases and
disorders. The novel nucleic acid encoding the human trypsin inhibitor-like protein, and the
human trypsin inhibitor-like protein of the invention, or fragments thereof, may further be
useful in diagnostic applications, wherein the presence or amount of the nucleic acid or the
protein is to be assessed. These materials are further useful in the generation of antibodies
that bind immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods.


NOV-X Nucleic acids 

The nucleic acids of the invention include those that encode a NOV-X polypeptide or 
protein. As used herein, the terms polypeptide and protein are interchangeable. 

In some embodiments, a NOV-X nucleic acid encodes a mature NOV-X polypeptide.
As used herein, a "mature" form of a polypeptide or protein described herein relates to the
product of a naturally occurring polypeptide or precursor form or proprotein. The naturally
occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example, the
full-length gene product, encoded by the corresponding gene. Alternatively, it may be defined
as the polypeptide, precursor or proprotein encoded by an open reading frame described
herein. The product "mature" form arises, again by way of nonlimiting example, as a result of
one or more naturally occurring processing steps that may take place within the cell in which
the gene product arises. Examples of such processing steps leading to a "mature" form of a
polypeptide or protein include the cleavage of the N-terminal methionine residue encoded by
the initiation codon of an open reading frame, or the proteolytic cleavage of a signal peptide or
leader sequence. Thus a mature form arising from a precursor polypeptide or protein that has
residues 1 to N, where residue 1 is the N-terminal methionine, would have residues 2 through
N remaining after removal of the N-terminal methionine. Alternatively, a mature form arising
from a precursor polypeptide or protein having residues 1 to N, in which an N-terminal signal
sequence from residue 1 to residue M is cleaved, would have the residues from residue M+1 to
residue N remaining. Further as used herein, a "mature" form of a polypeptide or protein may
arise from a step of post-translational modification other than a proteolytic cleavage event.
Such additional processes include, by way of non-limiting example, glycosylation,
myristoylation or phosphorylation. In general, a mature polypeptide or protein may result
from the operation of only one of these processes, or a combination of any of them.
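
The two cleavage cases described above (residues 2 through N, and residues M+1 through N) can be sketched in Python as follows. This is an illustration only: the function names and the example precursor are hypothetical, and the signal sequence length M is supplied by the caller rather than predicted.

```python
def remove_initiator_met(precursor: str) -> str:
    """Return residues 2..N when residue 1 is the N-terminal methionine."""
    if precursor.startswith("M"):
        return precursor[1:]
    return precursor


def cleave_signal_sequence(precursor: str, m: int) -> str:
    """Return residues M+1..N after removal of a signal sequence spanning residues 1..M."""
    if not 0 <= m < len(precursor):
        raise ValueError("m must satisfy 0 <= m < N")
    return precursor[m:]


# Hypothetical 12-residue precursor; not a sequence from this disclosure.
example = "MKLLAVFGHCKD"
print(remove_initiator_met(example))       # residues 2..12
print(cleave_signal_sequence(example, 5))  # residues 6..12
```

Post-translational modifications such as glycosylation are not modeled here, since they do not change the residue range.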

Among the NOV-X nucleic acids is the nucleic acid whose sequence is provided in
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a fragment thereof. Additionally,
the invention includes mutant or variant nucleic acids of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14,
16, 18, 20, 22, or 57, or a fragment thereof, any of whose bases may be changed from the
corresponding bases shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, while
still encoding a protein that maintains at least one of its NOV-X-like activities and
physiological functions (i.e., modulating angiogenesis, neuronal development). The invention
further includes the complement of the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10,
12, 14, 16, 18, 20, 22, or 57, including fragments, derivatives, analogs and homologs thereof.
The invention additionally includes nucleic acids or nucleic acid fragments, or complements
thereto, whose structures include chemical modifications.

One aspect of the invention pertains to isolated nucleic acid molecules that encode
NOV-X proteins or biologically active portions thereof. Also included are nucleic acid
fragments sufficient for use as hybridization probes to identify NOV-X-encoding nucleic acids
(e.g., NOV-X mRNA) and fragments for use as polymerase chain reaction (PCR) primers for
the amplification or mutation of NOV-X nucleic acid molecules. As used herein, the term
"nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic
DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using
nucleotide analogs, and derivatives, fragments and homologs thereof. The nucleic acid
molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

"Probes" refer to nucleic acid sequences of variable length, preferably between at least
about 10 nucleotides (nt), 100 nt, or as many as about, e.g., 6,000 nt, depending on use.
Probes are used in the detection of identical, similar, or complementary nucleic acid
sequences. Longer length probes are usually obtained from a natural or recombinant source,
are highly specific and much slower to hybridize than oligomers. Probes may be single- or
double-stranded and designed to have specificity in PCR, membrane-based hybridization
technologies, or ELISA-like technologies.

An "isolated" nucleic acid molecule is one that is separated from other nucleic acid
molecules that are present in the natural source of the nucleic acid. Examples of isolated
nucleic acid molecules include, but are not limited to, recombinant DNA molecules contained
in a vector, recombinant DNA molecules maintained in a heterologous host cell, partially or
substantially purified nucleic acid molecules, and synthetic DNA or RNA molecules.
Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic
acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of
the organism from which the nucleic acid is derived. For example, in various embodiments,
the isolated NOV-X nucleic acid molecule can contain less than about 50 kb, 25 kb, 5 kb, 4
kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic
acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover,
an "isolated" nucleic acid molecule, such as a cDNA molecule, can be substantially free of
other cellular material or culture medium when produced by recombinant techniques, or of
chemical precursors or other chemicals when chemically synthesized.

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having
the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a
complement of any of these nucleotide sequences, can be isolated using standard molecular
biology techniques and the sequence information provided herein. Using all or a portion of
the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 as a
hybridization probe, NOV-X nucleic acid sequences can be isolated using standard
hybridization and cloning techniques (e.g., as described in Sambrook et al., eds., Molecular
Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, 1989; and Ausubel et al., eds., Current Protocols in Molecular
Biology, John Wiley & Sons, New York, NY, 1993).

A nucleic acid of the invention can be amplified using cDNA, mRNA or, alternatively,
genomic DNA as a template and appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to NOV-X nucleotide sequences can be prepared by standard
synthetic techniques, e.g., using an automated DNA synthesizer.

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide
residues, which oligonucleotide has a sufficient number of nucleotide bases to be used in a
PCR reaction. A short oligonucleotide sequence may be based on, or designed from, a
genomic or cDNA sequence and is used to amplify, confirm, or reveal the presence of an
identical, similar or complementary DNA or RNA in a particular cell or tissue.
Oligonucleotides comprise portions of a nucleic acid sequence having about 10 nt, 50 nt, or
100 nt in length, preferably about 15 nt to 30 nt in length. In one embodiment, an
oligonucleotide comprising a nucleic acid molecule less than 100 nt in length would further
comprise at least 6 contiguous nucleotides of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20,
22, or 57, or a complement thereof. Oligonucleotides may be chemically synthesized and may
be used as probes.
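
The requirement that an oligonucleotide comprise at least 6 contiguous nucleotides of a reference sequence can be checked with a simple window comparison. The sketch below is a hypothetical Python helper operating on plain sequence strings; the example sequences are invented, and only the sense strand is compared (a complement would have to be passed in separately).

```python
def shares_contiguous_run(oligo: str, reference: str, min_len: int = 6) -> bool:
    """Return True if oligo contains at least min_len contiguous nucleotides of reference."""
    oligo = oligo.upper()
    reference = reference.upper()
    if len(oligo) < min_len or len(reference) < min_len:
        return False
    # Every window of length min_len found in the reference sequence.
    windows = {reference[i:i + min_len] for i in range(len(reference) - min_len + 1)}
    return any(oligo[i:i + min_len] in windows
               for i in range(len(oligo) - min_len + 1))


# Invented sequences for illustration; "ACGTGA" is common to both.
print(shares_contiguous_run("TTTACGTGA", "GGACGTGACC"))
```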

In another embodiment, an isolated nucleic acid molecule of the invention comprises a
nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:
1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a portion of this nucleotide sequence. A nucleic
acid molecule that is complementary to the nucleotide sequence shown in SEQ ID NO: 1, 3,
6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 is one that is sufficiently complementary to the
nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 that it
can hydrogen bond with little or no mismatches to the nucleotide sequence shown in SEQ ID
NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, thereby forming a stable duplex.
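
Complementarity with "little or no mismatches" can be illustrated by computing a reverse complement and counting mismatched positions. This Python sketch assumes Watson-Crick pairing of DNA only and equal-length strands; the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(target: str, candidate: str) -> int:
    """Count mismatched positions when candidate pairs antiparallel with target."""
    if len(target) != len(candidate):
        raise ValueError("sequences must have equal length")
    expected = reverse_complement(target)
    return sum(1 for a, b in zip(expected, candidate) if a != b)


print(reverse_complement("ATGCCG"))
print(count_mismatches("ATGCCG", "CGGCAT"))
```

A mismatch count of zero corresponds to a perfect complement; how many mismatches a stable duplex tolerates depends on the hybridization conditions.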

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen base
pairing between nucleotide units of a nucleic acid molecule, and the term "binding" means the
physical or chemical interaction between two polypeptides or compounds or associated
polypeptides or compounds or combinations thereof. Binding includes ionic, non-ionic, van
der Waals, hydrophobic interactions, etc. A physical interaction can be either direct or
indirect. Indirect interactions may be through or due to the effects of another polypeptide or
compound. Direct binding refers to interactions that do not take place through, or due to, the
effect of another polypeptide or compound, but instead are without other substantial chemical
intermediates.

Moreover, the nucleic acid molecule of the invention can comprise only a portion of
the nucleic acid sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, e.g., a
fragment that can be used as a probe or primer, or a fragment encoding a biologically active
portion of NOV-X. Fragments provided herein are defined as sequences of at least 6
(contiguous) nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow
for specific hybridization in the case of nucleic acids or for specific recognition of an epitope
in the case of amino acids, respectively, and are at most some portion less than a full length
sequence. Fragments may be derived from any contiguous portion of a nucleic acid or amino
acid sequence of choice. Derivatives are nucleic acid sequences or amino acid sequences
formed from the native compounds either directly or by modification or partial substitution.
Analogs are nucleic acid sequences or amino acid sequences that have a structure similar to,
but not identical to, the native compound but differ from it in respect to certain components
or side chains. Analogs may be synthetic or from a different evolutionary origin and may have
a similar or opposite metabolic activity compared to wild type.

Derivatives and analogs may be full length or other than full length, if the derivative or
analog contains a modified nucleic acid or amino acid, as described below. Derivatives or
analogs of the nucleic acids or proteins of the invention include, but are not limited to,
molecules comprising regions that are substantially homologous to the nucleic acids or
proteins of the invention, in various embodiments, by at least about 70%, 80%, 85%, 90%,
95%, 98%, or even 99% identity (with a preferred identity of 80-99%) over a nucleic acid or
amino acid sequence of identical size or when compared to an aligned sequence in which the
alignment is done by a computer homology program known in the art, or whose encoding
nucleic acid is capable of hybridizing to the complement of a sequence encoding the
aforementioned proteins under stringent, moderately stringent, or low stringent conditions.
See, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons,
New York, NY, 1993, and below. An exemplary program is the Gap program (Wisconsin
Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University
Research Park, Madison, WI) using the default settings, which uses the algorithm of Smith and
Waterman (Adv. Appl. Math., 1981, 2: 482-489, which is incorporated herein by reference in
its entirety).
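
For two sequences that have already been aligned, a percent identity figure of the kind recited above can be computed as follows. This Python sketch is not the Gap program: it assumes a pre-computed alignment with "-" as the gap character and, as one of several possible conventions, excludes gapped columns from the denominator. The function name is hypothetical; the two example rows are the NOV4a and TRYP rows of one block of the NOV-4 alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


nov4a = "VNTCRKMTVWGEVWENAVYFVCNYSPK-GNWIGEAPYKNGRPCSECPPSYGGSCRNNLCY"
tryp = "IHTCQNMNVWGSVWRRAVYLVCNYAPK-GNWIGEAPYKVGVPCSSCPPSYGGSCTDNLCF"
print(round(percent_identity(nov4a, tryp), 1))
```

The value covers only this 60-column block; identity over full-length sequences requires the complete alignment.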

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or
variations thereof, refer to sequences characterized by a homology at the nucleotide level or
amino acid level as discussed above. Homologous nucleotide sequences encode those
sequences coding for isoforms of a NOV-X polypeptide. Isoforms can be expressed in
different tissues of the same organism as a result of, for example, alternative splicing of RNA.
Alternatively, isoforms can be encoded by different genes. In the present invention,
homologous nucleotide sequences include nucleotide sequences encoding a NOV-X
polypeptide of species other than humans, including, but not limited to, mammals, and thus
can include, e.g., mouse, rat, rabbit, dog, cat, cow, horse, and other organisms. Homologous
nucleotide sequences also include, but are not limited to, naturally occurring allelic variations
and mutations of the nucleotide sequences set forth herein. A homologous nucleotide
sequence does not, however, include the nucleotide sequence encoding human NOV-X
protein. Homologous nucleic acid sequences include those nucleic acid sequences that encode
conservative amino acid substitutions (see below) in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17,
19, 21, or 23, as well as a polypeptide having NOV-X activity. Biological activities of the
NOV-X proteins are described below. A homologous amino acid sequence does not encode
the amino acid sequence of a human NOV-X polypeptide.

The nucleotide sequence determined from the cloning of the human NOV-X gene
allows for the generation of probes and primers designed for use in identifying and/or cloning
NOV-X homologues in other cell types, e.g., from other tissues, as well as NOV-X
homologues from other mammals. The probe/primer typically comprises a substantially
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide
sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200,
250, 300, 350 or 400 or more consecutive nucleotides of a sense strand nucleotide sequence
of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; of an anti-sense strand nucleotide
sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; or of a naturally
occurring mutant of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57.

Probes based on the human NOV-X nucleotide sequence can be used to detect
transcripts or genomic sequences encoding the same or homologous proteins. In various
embodiments, the probe further comprises a label group attached thereto, e.g., the label group
can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such
probes can be used as a part of a diagnostic test kit for identifying cells or tissue which
misexpress a NOV-X protein, such as by measuring a level of a NOV-X-encoding nucleic acid
in a sample of cells from a subject, e.g., detecting NOV-X mRNA levels or determining
whether a genomic NOV-X gene has been mutated or deleted.

A "polypeptide having a biologically active portion of NOV-X" refers to polypeptides
exhibiting activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay, with
or without dose dependency. A nucleic acid fragment encoding a "biologically active portion
of NOV-X" can be prepared by isolating a portion of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57 that encodes a polypeptide having a NOV-X biological activity (biological
activities of the NOV-X proteins are described below), expressing the encoded portion of
NOV-X protein (e.g., by recombinant expression in vitro) and assessing the activity of the
encoded portion of NOV-X. For example, a nucleic acid fragment encoding a biologically
active portion of NOV-X can optionally include an ATP-binding domain. In another
embodiment, a nucleic acid fragment encoding a biologically active portion of NOV-X
includes one or more regions.

NOV-X Variants 

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequences shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 due to
the degeneracy of the genetic code. These nucleic acids thus encode the same NOV-X protein
as that encoded by the nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18,
20, 22, or 57, e.g., the polypeptide of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. In
another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NO: 2, 4, 5, 7,
9, 11, 13, 15, 17, 19, 21, or 23.
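
Degeneracy of the genetic code means that different nucleotide sequences can encode the same protein. The Python sketch below illustrates this with the standard genetic code; the `translate` helper and the two 9-nt example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two different nucleotide sequences (invented for illustration).
seq_1 = "ATGAAACTG"
seq_2 = "ATGAAGTTA"
print(translate(seq_1), translate(seq_2))
```

Both sequences differ at two positions yet translate to the same three residues, because AAA and AAG both encode lysine and CTG and TTA both encode leucine.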

In addition to the human NOV-X nucleotide sequence shown in SEQ ID NO: 1, 3, 6, 8,
10, 12, 14, 16, 18, 20, 22, or 57, it will be appreciated by those skilled in the art that DNA
sequence polymorphisms that lead to changes in the amino acid sequences of NOV-X may
exist within a population (e.g., the human population). Such genetic polymorphism in the
NOV-X gene may exist among individuals within a population due to natural allelic variation.
As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules
comprising an open reading frame encoding a NOV-X protein, preferably a mammalian
NOV-X protein. Such natural allelic variations can typically result in 1-5% variance in the
nucleotide sequence of the NOV-X gene. Any and all such nucleotide variations and resulting
amino acid polymorphisms in NOV-X that are the result of natural allelic variation and that do
not alter the functional activity of NOV-X are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding NOV-X proteins from other species, and
thus that have a nucleotide sequence that differs from the human sequence of SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, are intended to be within the scope of the invention.
Nucleic acid molecules corresponding to natural allelic variants and homologues of the
NOV-X cDNAs of the invention can be isolated based on their homology to the human NOV-X
nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a hybridization
probe according to standard hybridization techniques under stringent hybridization conditions.
For example, a soluble human NOV-X cDNA can be isolated based on its homology to human
membrane-bound NOV-X. Likewise, a membrane-bound human NOV-X cDNA can be
isolated based on its homology to soluble human NOV-X.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12,
14, 16, 18, 20, 22, or 57. In another embodiment, the nucleic acid is at least 10, 25, 50, 100,
250, 500 or 750 nucleotides in length. In another embodiment, an isolated nucleic acid
molecule of the invention hybridizes to the coding region. As used herein, the term
"hybridizes under stringent conditions" is intended to describe conditions for hybridization and
washing under which nucleotide sequences at least 60% homologous to each other typically
remain hybridized to each other.

Homologs (i.e., nucleic acids encoding NOV-X proteins derived from species other
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or
high stringency hybridization with all or a portion of the particular human sequence as a probe
using methods well known in the art for nucleic acid hybridization and cloning.

As used herein, the phrase "stringent hybridization conditions" refers to conditions 
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no 
other sequences. Stringent conditions are sequence-dependent and will be different in 
different circumstances. Longer sequences hybridize specifically at higher temperatures than 
shorter sequences. Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The 
Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at 
which 50% of the probes complementary to the target sequence hybridize to the target 
sequence at equilibrium. Since the target sequences are generally present at excess, at Tm, 
50% of the probes are occupied at equilibrium. Typically, stringent conditions will be those in
which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M 
sodium ion (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short 
probes, primers or oligonucleotides (e.g., 10 nt to 50 nt) and at least about 60°C for longer 
probes, primers and oligonucleotides. Stringent conditions may also be achieved with the 
addition of destabilizing agents, such as formamide. 
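
The relationship between the Tm and a stringent hybridization temperature about 5°C below it can be sketched numerically. The Python example below estimates the Tm of a short oligonucleotide with the Wallace rule (2°C per A or T, 4°C per G or C), which is an assumption introduced here for illustration and is not a method recited in this description; it is a rough approximation for short oligonucleotides only and ignores ionic strength, pH and formamide. The function names and the 16-nt example are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Rough Tm estimate in degrees C for a short oligonucleotide (Wallace rule)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(oligo: str, offset_c: float = 5.0) -> float:
    """Temperature about offset_c degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset_c


# Invented 16-nt oligonucleotide with eight A/T and eight G/C bases.
probe = "ATGCATGCATGCATGC"
print(wallace_tm(probe), stringent_temperature(probe))
```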

Stringent conditions are known to those skilled in the art and can be found in Current
Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
Preferably, the conditions are such that sequences at least about 65%, 70%, 75%, 85%, 90%,
95%, 98%, or 99% homologous to each other typically remain hybridized to each other.
A non-limiting example of stringent hybridization conditions is hybridization in a high salt
buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02%
Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm DNA at 65°C. This hybridization
is followed by one or more washes in 0.2X SSC, 0.01% BSA at 50°C. An isolated nucleic
acid molecule of the invention that hybridizes under stringent conditions to the sequence of
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 corresponds to a naturally occurring 
nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to 
an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a 
natural protein). 

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic
acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57, or fragments, analogs or derivatives thereof, under conditions of moderate
stringency is provided. A non-limiting example of moderate stringency hybridization
conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml
denatured salmon sperm DNA at 55°C, followed by one or more washes in 1X SSC, 0.1%
SDS at 37°C. Other conditions of moderate stringency that may be used are well known in the
art. See, e.g., Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology,
John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer and Expression, A
Laboratory Manual, Stockton Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule
comprising the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or
57, or fragments, analogs or derivatives thereof, under conditions of low stringency, is
provided. A non-limiting example of low stringency hybridization conditions is
hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02%
PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol)
dextran sulfate at 40°C, followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH
7.4), 5 mM EDTA, and 0.1% SDS at 50°C. Other conditions of low stringency that may be
used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g.,
Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology, John Wiley &
Sons, NY, and Kriegler, 1990, Gene Transfer and Expression, A Laboratory Manual,
Stockton Press, NY; Shilo and Weinberg, 1981, Proc Natl Acad Sci USA 78: 6789-6792.

Conservative mutations

In addition to naturally-occurring allelic variants of the NOV-X sequence that may
exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16,
18, 20, 22, or 57, thereby leading to changes in the amino acid sequence of the encoded
NOV-X protein, without altering the functional ability of the NOV-X protein. For example,
nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid
residues can be made in the sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or
57. A "non-essential" amino acid residue is a residue that can be altered from the wild-type
sequence of NOV-X without altering the biological activity, whereas an "essential" amino acid
residue is required for biological activity. For example, amino acid residues that are
conserved among the NOV-X proteins of the present invention are predicted to be particularly
unamenable to alteration.

Another aspect of the invention pertains to nucleic acid molecules encoding NOV-X
proteins that contain changes in amino acid residues that are not essential for activity. Such
NOV-X proteins differ in amino acid sequence from SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17,
19, 21, or 23, yet retain biological activity. In one embodiment, the isolated nucleic acid
molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises
an amino acid sequence at least about 75% homologous to the amino acid sequence of SEQ ID
NO: 2, 4, 6, or 8. Preferably, the protein encoded by the nucleic acid is at least about 80%
homologous to SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, more preferably at least
about 90%, 95%, 98%, and most preferably at least about 99% homologous to SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23.

An isolated nucleic acid molecule encoding a NOV-X protein homologous to the
protein of can be created by introducing one or more nucleotide substitutions, additions or
deletions into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or
57, such that one or more amino acid substitutions, additions or deletions are introduced into
the encoded protein.

Mutations can be introduced into the nucleotide sequence of SEQ ID NO: 1, 3, 6, 8,
10, 12, 14, 16, 18, 20, 22, or 57 by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at
one or more predicted non-essential amino acid residues. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced with an amino acid residue
having a similar side chain. Families of amino acid residues having similar side chains have
been defined in the art. These families include amino acids with basic side chains (e.g., lysine,
arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side
chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar
side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine,
tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side
chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential
amino acid residue in NOV-X is replaced with another amino acid residue from the same side
chain family. Alternatively, in another embodiment, mutations can be introduced randomly
along all or part of a NOV-X coding sequence, such as by saturation mutagenesis, and the
resultant mutants can be screened for NOV-X biological activity to identify mutants that retain
activity. Following mutagenesis of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57,
the encoded protein can be expressed by any recombinant technology known in the art and the
activity of the protein can be determined.
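
By way of nonlimiting illustration, the side chain families listed above can be encoded as a lookup table to check whether a proposed substitution stays within a family. The following Python sketch is an assumption made for illustration only: the names SIDE_CHAIN_FAMILIES, shared_families and is_conservative are hypothetical, and the one-letter residue codes are the conventional abbreviations rather than anything recited in the text.

```python
# Side chain families as listed in the paragraph above, written with the
# conventional one-letter amino acid codes (an illustrative encoding).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original, replacement):
    """Return the sorted names of every family containing both residues."""
    a, b = original.upper(), replacement.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True when the residue changes but remains in at least one shared family."""
    if original.upper() == replacement.upper():
        return False  # no substitution was made
    return bool(shared_families(original, replacement))
```

Under this encoding, a lysine to arginine change (K to R) is reported as conservative, whereas an aspartic acid to lysine change (D to K) is not. Because some residues belong to more than one family (e.g., threonine, valine, histidine), the check accepts a substitution if any single family contains both residues.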

In one embodiment, a mutant NOV-X protein can be assayed for (1) the ability to form
protein:protein interactions with other NOV-X proteins, other cell-surface proteins, or
biologically active portions thereof; (2) complex formation between a mutant NOV-X protein
and a NOV-X receptor; (3) the ability of a mutant NOV-X protein to bind to an intracellular
target protein or biologically active portion thereof (e.g., avidin proteins); (4) the ability to
bind NOV-X protein; or (5) the ability to specifically bind an anti-NOV-X protein antibody.

Antisense NOV-X Nucleic acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or fragments,
analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence
that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence.
In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire NOV-X
coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments,
homologs, derivatives and analogs of a NOV-X protein of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13,
15, 17, 19, 21, or 23 or antisense nucleic acids complementary to a NOV-X nucleic acid
sequence of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence encoding NOV-X. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are translated
into amino acid residues (e.g., the protein coding region of human NOV-X corresponds to
SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23). In another embodiment, the antisense
nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence encoding NOV-X. The term "noncoding region" refers to 5' and 3' sequences which
flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and
3' untranslated regions).

Given the coding strand sequences encoding NOV-X disclosed herein (e.g., SEQ ID
NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of NOV-X mRNA,
but more preferably is an oligonucleotide that is antisense to only a portion of the coding or
noncoding region of NOV-X mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of NOV-X mRNA. An
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention can be constructed using
chemical synthesis or enzymatic ligation reactions using procedures known in the art. For
example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using naturally occurring nucleotides or variously modified nucleotides designed
to increase the biological stability of the molecules or to increase the physical stability of the
duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and acridine substituted nucleotides can be used.
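
As a minimal sketch of design by Watson and Crick base pairing, the following Python function derives a DNA oligonucleotide complementary to a chosen window of an mRNA (for example, the region surrounding the translation start site). The function name antisense_oligo and the short example sequence in the note below are hypothetical and are not taken from any SEQ ID NO; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Watson-Crick partner (as DNA) of each RNA base in the target window.
_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length):
    """Return a DNA oligonucleotide, written 5' to 3', that is complementary
    to the window mrna[start:start + length] of the sense strand.

    The input may be an mRNA or the cDNA coding strand; T is read as U.
    """
    sense = mrna.upper().replace("T", "U")
    if set(sense) - set("ACGU"):
        raise ValueError("sequence may contain only A, C, G, U or T")
    window = sense[start:start + length]
    if start < 0 or len(window) != length:
        raise ValueError("window falls outside the sequence")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return window.translate(_COMPLEMENT)[::-1]
```

For the hypothetical sequence GGCACCAUGGCUAGC, calling antisense_oligo with start 3 and length 9 targets the window ACCAUGGCU, which spans the AUG start codon, and returns AGCCATGGT.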

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a NOV-X protein to thereby inhibit expression of the protein, e.g., by
inhibiting transcription and/or translation. The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface,
e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to
cells using the vectors described herein. To achieve sufficient intracellular concentrations of
antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed
under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641).
The antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et
al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al.
(1987) FEBS Lett 215: 327-330).

Such modifications include, by way of nonlimiting example, modified bases, and
nucleic acids whose sugar phosphate backbones are modified or derivatized. These
modifications are carried out at least in part to enhance the chemical stability of the modified
nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in
therapeutic applications in a subject.

NOV-X Ribozymes and PNA moieties 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary
region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach
(1988) Nature 334:585-591)) can be used to catalytically cleave NOV-X mRNA transcripts to
thereby inhibit translation of NOV-X mRNA. A ribozyme having specificity for a
NOV-X-encoding nucleic acid can be designed based upon the nucleotide sequence of a NOV-X
DNA disclosed herein (i.e., SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57). For
example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the
nucleotide sequence of the active site is complementary to the nucleotide sequence to be
cleaved in a NOV-X-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and
Cech et al. U.S. Pat. No. 5,116,742. Alternatively, NOV-X mRNA can be used to select a
catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See,
e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, NOV-X gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the NOV-X (e.g., the NOV-X promoter
and/or enhancers) to form triple helical structures that prevent transcription of the NOV-X
gene in target cells. See generally, Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et
al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of NOV-X can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic
acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by
a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of NOV-X can be used in therapeutic and diagnostic applications. For example,
PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs
of NOV-X can also be used, e.g., in the analysis of single base pair mutations in a gene by,
e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in combination
with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or primers for
DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of NOV-X can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras of NOV-X can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion
while the PNA portion would provide high binding affinity and specificity. PNA-DNA
chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking,
number of bonds between the nucleobases, and orientation (Hyrup (1996) above). The
synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996) above and
Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain can be synthesized
on a solid support using standard phosphoramidite coupling chemistry, and modified
nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can
be used between the PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17:
5973-88). PNA monomers are then coupled in a stepwise manner to produce a chimeric
molecule with a 5' PNA segment and a 3' DNA segment (Finn et al. (1996) above).
Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA
segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across
the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No.
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In
addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see,
e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988,
Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another
molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a
hybridization-triggered cleavage agent, etc.

NOV-X Polypeptides 

A NOV-X polypeptide of the invention includes the NOV-X-like protein whose
sequence is provided in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. The invention
also includes a mutant or variant protein any of whose residues may be changed from the
corresponding residue shown in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 while
still encoding a protein that maintains its NOV-X-like activities and physiological functions,
or a functional fragment thereof. In some embodiments, up to 20% or more of the residues
may be so changed in the mutant or variant protein. In some embodiments, the NOV-X
polypeptide according to the invention is a mature polypeptide.

In general, a NOV-X-like variant that preserves NOV-X-like function includes any
variant in which residues at a particular position in the sequence have been substituted by
other amino acids, and further includes the possibility of inserting an additional residue or
residues between two residues of the parent protein as well as the possibility of deleting one or
more residues from the parent sequence. Any amino acid substitution, insertion, or deletion is
encompassed by the invention. In favorable circumstances, the substitution is a conservative
substitution as defined above.

One aspect of the invention pertains to isolated NOV-X proteins, and biologically
active portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided
are polypeptide fragments suitable for use as immunogens to raise anti-NOV-X antibodies. In
one embodiment, native NOV-X proteins can be isolated from cells or tissue sources by an
appropriate purification scheme using standard protein purification techniques. In another
embodiment, NOV-X proteins are produced by recombinant DNA techniques. Alternative to
recombinant expression, a NOV-X protein or polypeptide can be synthesized chemically using
standard peptide synthesis techniques.

An "isolated" or "purified" protein or biologically active portion thereof is substantially
free of cellular material or other contaminating proteins from the cell or tissue source from
which the NOV-X protein is derived, or substantially free from chemical precursors or other
chemicals when chemically synthesized. The language "substantially free of cellular material"
includes preparations of NOV-X protein in which the protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced.

In one embodiment, the language "substantially free of cellular material" includes
preparations of NOV-X protein having less than about 30% (by dry weight) of non-NOV-X
protein (also referred to herein as a "contaminating protein"), more preferably less than about
20% of non-NOV-X protein, still more preferably less than about 10% of non-NOV-X protein,
and most preferably less than about 5% non-NOV-X protein. When the NOV-X protein or
biologically active portion thereof is recombinantly produced, it is also preferably
substantially free of culture medium, i.e., culture medium represents less than about 20%,
more preferably less than about 10%, and most preferably less than about 5% of the volume of
the protein preparation.

The language "substantially free of chemical precursors or other chemicals" includes
preparations of NOV-X protein in which the protein is separated from chemical precursors or
other chemicals that are involved in the synthesis of the protein. In one embodiment, the
language "substantially free of chemical precursors or other chemicals" includes preparations
of NOV-X protein having less than about 30% (by dry weight) of chemical precursors or
non-NOV-X chemicals, more preferably less than about 20% chemical precursors or
non-NOV-X chemicals, still more preferably less than about 10% chemical precursors or
non-NOV-X chemicals, and most preferably less than about 5% chemical precursors or
non-NOV-X chemicals.

Biologically active portions of a NOV-X protein include peptides comprising amino
acid sequences sufficiently homologous to or derived from the amino acid sequence of the
NOV-X protein, e.g., the amino acid sequence shown in SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15,
17, 19, 21, or 23, that include fewer amino acids than the full length NOV-X proteins, and
exhibit at least one activity of a NOV-X protein. Typically, biologically active portions
comprise a domain or motif with at least one activity of the NOV-X protein. A biologically
active portion of a NOV-X protein can be a polypeptide which is, for example, 10, 25, 50, 100
or more amino acids in length.

A biologically active portion of a NOV-X protein of the present invention may contain
at least one of the above-identified domains conserved between the NOV-X proteins, e.g.,
TSR modules. Moreover, other biologically active portions, in which other regions of the
protein are deleted, can be prepared by recombinant techniques and evaluated for one or more
of the functional activities of a native NOV-X protein.

In an embodiment, the NOV-X protein has an amino acid sequence shown in SEQ ID
NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23. In other embodiments, the NOV-X protein is
substantially homologous to SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 and retains
the functional activity of the protein of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23
yet differs in amino acid sequence due to natural allelic variation or mutagenesis, as described
in detail below. Accordingly, in another embodiment, the NOV-X protein is a protein that
comprises an amino acid sequence at least about 45% homologous to the amino acid sequence
of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 and retains the functional activity of
the NOV-X proteins of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23.

Determining homology between two or more sequences

To determine the percent homology of two amino acid sequences or of two nucleic
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced
in either of the sequences being compared for optimal alignment between the sequences). The
amino acid residues or nucleotides at corresponding amino acid positions or nucleotide
positions are then compared. When a position in the first sequence is occupied by the same
amino acid residue or nucleotide as the corresponding position in the second sequence, then
the molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid
"homology" is equivalent to amino acid or nucleic acid "identity").

The nucleic acid sequence homology may be determined as the degree of identity
between two sequences. The homology may be determined using computer programs known
in the art, such as GAP software provided in the GCG program package. See, Needleman and
Wunsch 1970 J Mol Biol 48: 443-453. Using GCG GAP software with the following settings
for nucleic acid sequence comparison: GAP creation penalty of 5.0 and GAP extension
penalty of 0.3, the coding region of the analogous nucleic acid sequences referred to above
exhibits a degree of identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or
99%, with the CDS (encoding) part of the DNA sequence shown in SEQ ID NO: 1, 3, 6, 8,
10, 12, 14, 16, 18, 20, 22, or 57.

The term "sequence identity" refers to the degree to which two polynucleotide or
polypeptide sequences are identical on a residue-by-residue basis over a particular region of
comparison. The term "percentage of sequence identity" is calculated by comparing two
optimally aligned sequences over that region of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing the
number of matched positions by the total number of positions in the region of comparison (i.e.,
the window size), and multiplying the result by 100 to yield the percentage of sequence
identity. The term "substantial identity" as used herein denotes a characteristic of a
polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 80
percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent
sequence identity, more usually at least 99 percent sequence identity as compared to a
reference sequence over a comparison region. The term "percentage of positive residues" is
calculated by comparing two optimally aligned sequences over that region of comparison,
determining the number of positions at which the identical and conservative amino acid
substitutions, as defined above, occur in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in the
region of comparison (i.e., the window size), and multiplying the result by 100 to yield the
percentage of positive residues.
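
The two percentages defined above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the GCG GAP program: the two inputs are assumed to be already optimally aligned, of equal length, with "-" marking gaps; a gapped position counts toward the window size but never as a match; and the function names and the one-letter encoding of the side chain families are hypothetical.

```python
# Side chain families from the conservative substitution definition,
# written with conventional one-letter codes (an illustrative encoding).
SIDE_CHAIN_FAMILIES = [
    set("KRH"), set("DE"), set("GNQSTYC"),
    set("AVLIPFMW"), set("TVI"), set("YFWH"),
]


def _check(a, b):
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("aligned sequences must be non-empty and of equal length")


def percent_identity(a, b):
    """Matched positions / window size * 100 for two aligned sequences."""
    _check(a, b)
    matched = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matched / len(a)


def is_positive(x, y):
    """Identical residues, or residues sharing at least one side chain family."""
    if "-" in (x, y):
        return False
    return x == y or any(x in fam and y in fam for fam in SIDE_CHAIN_FAMILIES)


def percent_positives(a, b):
    """Identical plus conservative positions / window size * 100."""
    _check(a, b)
    matched = sum(1 for x, y in zip(a, b) if is_positive(x, y))
    return 100.0 * matched / len(a)
```

For the hypothetical aligned nucleotide strings AC-GT and ACTGA, three of five positions match, giving 60.0 percent identity. For the hypothetical aligned peptides KDAY and RDGF, one of four positions is identical (25.0 percent identity) and three of four are identical or conservative (75.0 percent positive residues).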

Chimeric and fusion proteins 

The invention also provides NOV-X chimeric or fusion proteins. As used herein, a
NOV-X "chimeric protein" or "fusion protein" comprises a NOV-X polypeptide operatively
linked to a non-NOV-X polypeptide. A "NOV-X polypeptide" refers to a polypeptide having
an amino acid sequence corresponding to NOV-X, whereas a "non-NOV-X polypeptide"
refers to a polypeptide having an amino acid sequence corresponding to a protein that is not
substantially homologous to the NOV-X protein, e.g., a protein that is different from the
NOV-X protein and that is derived from the same or a different organism. Within a NOV-X
fusion protein the NOV-X polypeptide can correspond to all or a portion of a NOV-X protein.
In one embodiment, a NOV-X fusion protein comprises at least one biologically active portion
of a NOV-X protein. In another embodiment, a NOV-X fusion protein comprises at least two
biologically active portions of a NOV-X protein. Within the fusion protein, the term
"operatively linked" is intended to indicate that the NOV-X polypeptide and the non-NOV-X
polypeptide are fused in-frame to each other. The non-NOV-X polypeptide can be fused to
the N-terminus or C-terminus of the NOV-X polypeptide.

For example, in one embodiment a NOV-X fusion protein comprises a NOV-X
polypeptide operably linked to the extracellular domain of a second protein. Such fusion
proteins can be further utilized in screening assays for compounds that modulate NOV-X
activity (such assays are described in detail below).

In another embodiment, the fusion protein is a GST-NOV-X fusion protein in which
the NOV-X sequences are fused to the C-terminus of the GST (i.e., glutathione S-transferase)
sequences. Such fusion proteins can facilitate the purification of recombinant NOV-X.

In another embodiment, the fusion protein is a NOV-X-immunoglobulin fusion protein
in which the NOV-X sequences comprising one or more domains are fused to sequences
derived from a member of the immunoglobulin protein family. The NOV-X-immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a NOV-X ligand and a NOV-X
protein on the surface of a cell, to thereby suppress NOV-X-mediated signal transduction in
vivo. In one nonlimiting example, a contemplated NOV-X ligand of the invention is the
NOV-X receptor. The NOV-X-immunoglobulin fusion proteins can be used to affect the
bioavailability of a NOV-X cognate ligand. Inhibition of the NOV-X ligand/NOV-X
interaction may be useful therapeutically for both the treatment of proliferative and
differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) cell
survival, as well as acute and chronic inflammatory disorders and hyperplastic wound healing,
e.g., hypertrophic scars and keloids. Moreover, the NOV-X-immunoglobulin fusion proteins
of the invention can be used as immunogens to produce anti-NOV-X antibodies in a subject, to
purify NOV-X ligands, and in screening assays to identify molecules that inhibit the
interaction of NOV-X with a NOV-X ligand.

A NOV-X chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate,
alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In
another embodiment, the fusion gene can be synthesized by conventional techniques including
automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be
carried out using anchor primers that give rise to complementary overhangs between two
consecutive gene fragments that can subsequently be annealed and reamplified to generate a
chimeric gene sequence (see, for example, Ausubel et al. (eds.) Current Protocols in
Molecular Biology, John Wiley & Sons, 1992). Moreover, many expression vectors are
commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A
NOV-X-encoding nucleic acid can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the NOV-X protein.

NOV-X agonists and antagonists

The present invention also pertains to variants of the NOV-X proteins that function as
either NOV-X agonists (mimetics) or as NOV-X antagonists. Variants of the NOV-X protein
can be generated by mutagenesis, e.g., discrete point mutation or truncation of the NOV-X
protein. An agonist of the NOV-X protein can retain substantially the same, or a subset of, the
biological activities of the naturally occurring form of the NOV-X protein. An antagonist of
the NOV-X protein can inhibit one or more of the activities of the naturally occurring form of
the NOV-X protein by, for example, competitively binding to a downstream or upstream
member of a cellular signaling cascade which includes the NOV-X protein. Thus, specific
biological effects can be elicited by treatment with a variant of limited function. In one
embodiment, treatment of a subject with a variant having a subset of the biological activities
of the naturally occurring form of the protein has fewer side effects in a subject relative to
treatment with the naturally occurring form of the NOV-X proteins.

Variants of the NOV-X protein that function as either NOV-X agonists (mimetics) or
as NOV-X antagonists can be identified by screening combinatorial libraries of mutants, e.g.,
truncation mutants, of the NOV-X protein for NOV-X protein agonist or antagonist activity.
In one embodiment, a variegated library of NOV-X variants is generated by combinatorial
mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A
variegated library of NOV-X variants can be produced by, for example, enzymatically ligating
a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of
potential NOV-X sequences is expressible as individual polypeptides, or alternatively, as a set
of larger fusion proteins (e.g., for phage display) containing the set of NOV-X sequences
therein. There are a variety of methods which can be used to produce libraries of potential
NOV-X variants from a degenerate oligonucleotide sequence. Chemical synthesis of a
degenerate gene sequence can be performed in an automatic DNA synthesizer, and the
synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of
genes allows for the provision, in one mixture, of all of the sequences encoding the desired set
of potential NOV-X sequences. Methods for synthesizing degenerate oligonucleotides are
known in the art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu Rev
Biochem 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucl Acid Res
11:477).
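
To illustrate what a degenerate oligonucleotide sequence represents, the following Python sketch enumerates every discrete sequence encoded by a degenerate string. It assumes the standard IUPAC nucleotide ambiguity codes, which are not recited in the text, and the names IUPAC, expand_degenerate and library_size are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (an assumption of this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _choices(degenerate):
    """Return the list of allowed bases at each position."""
    try:
        return [IUPAC[base] for base in degenerate.upper()]
    except KeyError as err:
        raise ValueError(f"unknown nucleotide code: {err.args[0]}") from None


def expand_degenerate(degenerate):
    """List every discrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*_choices(degenerate))]


def library_size(degenerate):
    """Number of discrete sequences, computed without enumerating them."""
    size = 1
    for options in _choices(degenerate):
        size *= len(options)
    return size
```

For example, the degenerate codon GAY expands to GAC and GAT, and the degenerate codon NNK represents 32 discrete codons. Computing the size separately from the enumeration is a design choice that keeps the count cheap for long degenerate sequences, where listing every member would be impractical.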



Polypeptide libraries

In addition, libraries of fragments of the NOV-X protein coding sequence can be used
to generate a variegated population of NOV-X fragments for screening and subsequent
selection of variants of a NOV-X protein. In one embodiment, a library of coding sequence
fragments can be generated by treating a double stranded PCR fragment of a NOV-X coding
sequence with a nuclease under conditions wherein nicking occurs only about once per
molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded
DNA that can include sense/antisense pairs from different nicked products, removing single
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the
resulting fragment library into an expression vector. By this method, an expression library can
be derived which encodes N-terminal and internal fragments of various sizes of the NOV-X
protein.

Several techniques are known in the art for screening gene products of combinatorial
libraries made by point mutations or truncation, and for screening cDNA libraries for gene
products having a selected property. Such techniques are adaptable for rapid screening of the
gene libraries generated by the combinatorial mutagenesis of NOV-X proteins. The most
widely used techniques, which are amenable to high throughput analysis, for screening large
gene libraries typically include cloning the gene library into replicable expression vectors,
transforming appropriate cells with the resulting library of vectors, and expressing the
combinatorial genes under conditions in which detection of a desired activity facilitates
isolation of the vector encoding the gene whose product was detected. Recursive ensemble
mutagenesis (REM), a new technique that enhances the frequency of functional mutants in the
libraries, can be used in combination with the screening assays to identify NOV-X variants
(Arkin and Youvan (1992) PNAS 89:7811-7815; Delagrave et al. (1993) Protein Engineering
6:327-331).

NOV-X Antibodies

Also included in the invention are antibodies to NOV-X proteins, or fragments of
NOV-X proteins. The term "antibody" as used herein refers to immunoglobulin molecules
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen binding site that specifically binds (immunoreacts with) an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab,
Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ
from one another by the nature of the heavy chain present in the molecule. Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a
reference to all such classes, subclasses and types of human antibody species.

An isolated NOV-X-related protein of the invention may be intended to serve as an
antigen, or a portion or fragment thereof, and additionally can be used as an immunogen to
generate antibodies that immunospecifically bind the antigen, using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length protein can be used or,
alternatively, the invention provides antigenic peptide fragments of the antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the
amino acid sequence of the full length protein, such as an amino acid sequence shown in SEQ
ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, or 20, and encompasses an epitope thereof such that an
antibody raised against the peptide forms a specific immune complex with the full length
protein or with any fragment that contains the epitope. Preferably, the antigenic peptide
comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least 20
amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by
the antigenic peptide are regions of the protein that are located on its surface; commonly these
are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of NOV-X-related protein that is located on the surface of the
protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human NOV-X-related
protein sequence will indicate which regions of a NOV-X-related protein are particularly
hydrophilic and, therefore, are likely to encode surface residues useful for targeting antibody
production. As a means for targeting antibody production, hydropathy plots showing regions
of hydrophilicity and hydrophobicity may be generated by any method well known in the art,
including, for example, the Kyte Doolittle or the Hopp Woods methods, either with or without
Fourier transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78:
3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each of which is incorporated
herein by reference in its entirety. Antibodies that are specific for one or more domains within
an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also provided
herein.
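
As an illustration of the hydropathy analysis mentioned above, the following is a minimal sketch of a sliding-window Kyte-Doolittle profile without Fourier transformation. The per-residue values are the published Kyte-Doolittle hydropathy indices; the window length of 7, the function names, and the example peptide are assumptions chosen for illustration and do not correspond to any NOV-X sequence.

```python
# Kyte-Doolittle hydropathy indices (positive = hydrophobic, negative = hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of each full window; empty if seq is shorter than window."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


# Hypothetical peptide used only to exercise the functions.
peptide = "MKTLLLAVAVLLGSDEKRKDEEAIVLLAVF"
print(most_hydrophilic_window(peptide))
```

Windows with the most negative mean score are candidate surface-exposed, hydrophilic stretches; in practice such a prediction would be only a starting point for selecting antigenic peptides.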

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal
or monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated
to a second protein known to be immunogenic in the mammal being immunized. Examples of
such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further
include an adjuvant. Various adjuvants used to increase the immunological response include,
but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum
hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions,
peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille
Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents.
Additional examples of adjuvants which can be employed include MPL-TDM adjuvant
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D. Wilkinson
(The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17,
2000), pp. 25-28).

Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as
used herein, refers to a population of antibody molecules that contain only one molecular
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs) of
the monoclonal antibody are identical in all the molecules of the population. MAbs thus
contain an antigen binding site capable of immunoreacting with a particular epitope of the
antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically
bind to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or
a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of
human origin are desired, or spleen cells or lymph node cells are used if non-human
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma
cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines are
employed. The hybridoma cells can be cultured in a suitable culture medium that preferably
contains one or more substances that inhibit the growth or survival of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which
substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high
level expression of antibody by the selected antibody-producing cells, and are sensitive to a
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, San
Diego, California and the American Type Culture Collection, Manassas, Virginia. Human
myeloma and mouse-human heteromyeloma cell lines also have been described for the
production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur
et al., Monoclonal Antibody Production Techniques and Applications, Marcel Dekker, Inc.,
New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in
the art. The binding affinity of the monoclonal antibody can, for example, be determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
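
To illustrate the kind of calculation that underlies a Scatchard analysis, the following is a simplified sketch that fits a straight line to bound/free versus bound for a single class of binding sites, where the slope equals -1/Kd and the x-intercept equals Bmax. It is an ordinary least-squares illustration only and is not the procedure of Munson and Pollard; the function name `scatchard_fit`, the units, and the synthetic data are assumptions.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from a linear Scatchard plot of bound/free vs. bound.

    Assumes a single class of non-interacting sites, for which
    bound/free = (Bmax - bound) / Kd.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -1/Kd
    intercept = mean_y - slope * mean_x    # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units chosen for illustration).
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
print(round(kd, 6), round(bmax, 6))
```

With real, noisy binding data the linear transform distorts the error structure, so a nonlinear fit of the binding isotherm would generally be preferred; the sketch is meant only to show what quantities the analysis yields.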

After the desired hybridoma cells are identified, the clones can be subcloned by
limiting dilution procedures and grown by standard methods. Suitable culture media for this
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from
the culture medium or ascites fluid by conventional immunoglobulin purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of
the invention can be readily isolated and sequenced using conventional procedures (e.g., by
using oligonucleotide probes that are capable of binding specifically to genes encoding the
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as a
preferred source of such DNA. Once isolated, the DNA can be placed into expression vectors,
which are then transfected into host cells such as simian COS cells, Chinese hamster ovary
(CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, to
obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA also
can be modified, for example, by substituting the coding sequence for human heavy and light
chain constant domains in place of the homologous murine sequences (U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the
immunoglobulin coding sequence all or part of the coding sequence for a non-immunoglobulin
polypeptide. Such a non-immunoglobulin polypeptide can be substituted for the constant
domains of an antibody of the invention, or can be substituted for the variable domains of one
antigen-combining site of an antibody of the invention to create a chimeric bivalent antibody.

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against the
administered immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.

Humanization can be performed following the method of Winter and co-workers
(Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988);
Verhoeyen et al., Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR
sequences for the corresponding sequences of a human antibody. (See also U.S. Patent No.
5,225,539.) In some instances, Fv framework residues of the human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies can also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, the humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of the
framework regions are those of a human immunoglobulin consensus sequence. The
humanized antibody optimally also will comprise at least a portion of an immunoglobulin
constant region (Fc), typically that of a human immunoglobulin (Jones et al., 1986;
Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).

Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: MONOCLONAL
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by
using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or
by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825;
5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992));
Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild
et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826
(1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's
endogenous antibodies in response to challenge by an antigen. (See PCT publication
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains in
the nonhuman host have been incapacitated, and active loci encoding human heavy and light
chain immunoglobulins are inserted into the host's genome. The human genes are
incorporated, for example, using yeast artificial chromosomes containing the requisite human
DNA segments. An animal which provides all the desired modifications is then obtained as
progeny by crossbreeding intermediate transgenic animals containing fewer than the full
complement of the modifications. The preferred embodiment of such a nonhuman animal is a
mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 96/33735 and
WO 96/34096. This animal produces B cells which secrete fully human immunoglobulins.
The antibodies can be obtained directly from the animal after immunization with an
immunogen of interest, as, for example, a preparation of a polyclonal antibody, or alternatively
from immortalized B cells derived from the animal, such as hybridomas producing
monoclonal antibodies. Additionally, the genes encoding the immunoglobulins with human
variable regions can be recovered and expressed to obtain the antibodies directly, or can be
further modified to obtain analogs of antibodies such as, for example, single chain Fv
molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent
No. 5,939,598. It can be obtained by a method including deleting the J segment genes from at
least one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of
the locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain
locus, the deletion being effected by a targeting vector containing a gene encoding a selectable
marker; and producing from the embryonic stem cell a transgenic mouse whose somatic and
germ cells contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed
in U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture,
introducing an expression vector containing a nucleotide sequence encoding a light chain into
another mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell
expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049.

Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective
identification of monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the
idiotypes to a protein antigen may be produced by techniques known in the art including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii)
an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab
fragment generated by the treatment of the antibody molecule with papain and a reducing
agent; and (iv) Fv fragments.

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is
any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce
a potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by
affinity chromatography steps. Similar procedures are disclosed in WO 93/08829, published
13 May 1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
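
The figure of ten different antibody molecules follows from simple counting, which the sketch below makes explicit. It assumes that every heavy chain can pair with every light chain and that an antibody is an unordered pair of heavy/light arms; the chain labels HA, HB, LA, and LB are hypothetical names for the chains of the two parental antibodies A and B.

```python
from itertools import combinations_with_replacement, product

# Hypothetical labels: heavy and light chains of parental antibodies A and B.
heavy = ["HA", "HB"]
light = ["LA", "LB"]

# Each arm is one heavy chain paired with one light chain (4 possible arms).
arms = list(product(heavy, light))

# An antibody consists of two arms; the order of the arms does not matter.
antibodies = list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(antibody):
    """True only when each heavy chain carries its cognate light chain."""
    return set(antibody) == {("HA", "LA"), ("HB", "LB")}


correct = [ab for ab in antibodies if is_correct_bispecific(ab)]
print(len(antibodies), len(correct))
```

Four possible arms taken two at a time with repetition give ten molecules, of which exactly one carries both cognate heavy/light pairs.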

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of the
fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and are
co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which
are recovered from recombinant cell culture. The preferred interface comprises at least a part
of the CH3 region of an antibody constant domain. In this method, one or more small amino
acid side chains from the interface of the first antibody molecule are replaced with larger side
chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the
large side chain(s) are created on the interface of the second antibody molecule by replacing
large amino acid side chains with smaller ones (e.g. alanine or threonine). This provides a
mechanism for increasing the yield of the heterodimer over other unwanted end-products such
as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab'
fragment was separately secreted from E. coli and subjected to directed chemical coupling in
vitro to form the bispecific antibody. The bispecific antibody thus formed was able to bind to
cells overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic
activity of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody heterodimers.
This method can also be utilized for the production of antibody homodimers. The "diabody"
technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has
provided an alternative mechanism for making bispecific antibody fragments. The fragments
comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL)
by a linker which is too short to allow pairing between the two domains on the same chain.
Accordingly, the VH and VL domains of one fragment are forced to pair with the
complementary VL and VH domains of another fragment, thereby forming two antigen-binding
sites. Another strategy for making bispecific antibody fragments by the use of single-chain Fv
(sFv) dimers has also been reported. See, Gruber et al., J. Immunol. 152:5368 (1994).
Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or
Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as
to focus cellular defense mechanisms to the cell expressing the particular antigen. Bispecific
antibodies can also be used to direct cytotoxic agents to cells which express a particular
antigen. These antibodies possess an antigen-binding arm and an arm which binds a cytotoxic
agent or a radionuclide chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another
bispecific antibody of interest binds the protein antigen described herein and further binds
tissue factor (TF).

Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking agents.
For example, immunotoxins can be constructed using a disulfide exchange reaction or by
forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate
and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No.
4,676,980.

Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain
disulfide bond formation in this region. The homodimeric antibody thus generated can have
improved internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176:
1191-1195 (1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with
enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as
described in Wolff et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody
can be engineered that has dual Fc regions and can thereby have enhanced complement lysis
and ADCC capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain
(from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor,
gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples
include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCL), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as tolylene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for
conjugation of radionucleotide to the antibody. See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

NOV-X Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors,
containing a nucleic acid encoding a NOV-X protein, or derivatives, fragments, analogs or
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable
of transporting another nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into which additional DNA
segments can be ligated. Another type of vector is a viral vector, wherein additional DNA
segments can be ligated into the viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced (e.g., bacterial vectors having a
bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they are
operatively-linked. Such vectors are referred to herein as "expression vectors". In general,
expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
In the present specification, "plasmid" and "vector" can be used interchangeably as the
plasmid is the most commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the
invention in a form suitable for expression of the nucleic acid in a host cell, which means that
the recombinant expression vectors include one or more regulatory sequences, selected on the
basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid
sequence to be expressed. Within a recombinant expression vector, "operably-linked" is
intended to mean that the nucleotide sequence of interest is linked to the regulatory
sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell when the vector is introduced into the
host cell).

The term "regulatory sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such regulatory sequences are
described, for example, in Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include
those that direct constitutive expression of a nucleotide sequence in many types of host cell
and those that direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the
design of the expression vector can depend on such factors as the choice of the host cell to be
transformed, the level of expression of protein desired, etc. The expression vectors of the
invention can be introduced into host cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., NOV-X
proteins, mutant forms of NOV-X proteins, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of
NOV-X proteins in prokaryotic or eukaryotic cells. For example, NOV-X proteins can be
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel,
Gene Expression Technology: Methods in Enzymology 185, Academic Press, San
Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and
translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in Escherichia coli with 
vectors containing constitutive or inducible promoters directing the expression of either fusion 
or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded 
therein, usually to the amino terminus of the recombinant protein. Such fusion vectors
typically serve three purposes: (i) to increase expression of recombinant protein; (ii) to
increase the solubility of the recombinant protein; and (iii) to aid in the purification of the
recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression
vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the
recombinant protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors
include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988. Gene 67: 31-40), pMAL
(New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the
target recombinant protein.
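
The arrangement of fusion moiety, cleavage site and recombinant protein described above can be
sketched in a few lines of Python. This is only an illustration under stated assumptions: the
recognition motifs are the commonly cited consensus sites for each protease and are not taken
from this specification, and the fusion-moiety and target sequences in the usage note are
hypothetical placeholders rather than any NOV-X or GST sequence.

```python
from typing import Tuple

# Assumed consensus motifs (one-letter amino acid code), not values from this document.
# Each entry: protease -> (motif, number of motif residues that remain on the
# N-terminal, fusion-moiety side after cleavage).
CLEAVAGE_SITES = {
    "factor_xa": ("IEGR", 4),      # cut after the Arg of IEGR
    "thrombin": ("LVPRGS", 4),     # cut between Arg and Gly; Gly-Ser stays on the target
    "enterokinase": ("DDDDK", 5),  # cut after the Lys of DDDDK
}


def build_fusion(fusion_moiety: str, target: str, protease: str) -> str:
    """Join an N-terminal fusion moiety to the target through a cleavage site."""
    motif, _ = CLEAVAGE_SITES[protease]
    return fusion_moiety + motif + target


def cleave(fusion: str, protease: str) -> Tuple[str, str]:
    """Split a fusion protein at the first occurrence of the protease motif.

    Returns (fusion-moiety fragment, released target fragment).
    """
    motif, n_side = CLEAVAGE_SITES[protease]
    start = fusion.find(motif)
    if start < 0:
        raise ValueError("no %s site found in the fusion protein" % protease)
    cut = start + n_side
    return fusion[:cut], fusion[cut:]
```

For example, `cleave(build_fusion("MSPILG", "MKLV", "factor_xa"), "factor_xa")` separates the
placeholder target `MKLV` from the fusion moiety. With the thrombin motif, two residues of the
site remain attached to the released target, which is one reason the choice of protease matters
when the native amino terminus is needed.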

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et
al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89).

One strategy to maximize recombinant protein expression in E. coli is to express the
protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant
protein. See, e.g., Gottesman, Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is to alter the
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli (see, e.g.,
Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic acid
sequences of the invention can be carried out by standard DNA synthesis techniques.
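
The codon-alteration strategy can be illustrated with a minimal Python sketch that translates a
coding sequence with the standard genetic code and rewrites every codon as a single preferred
synonym. The `PREFERRED` table below is an illustrative assumption chosen for the example; it
is not taken from Wada et al. or from this specification, and a practical design would use
measured codon-usage frequencies for the intended host.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical "preferred" codon per amino acid ("*" is the stop signal).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon.

    The encoded amino acid sequence is unchanged; only codon choice differs.
    """
    return "".join(PREFERRED[aa] for aa in translate(dna))
```

Because each replacement codon is synonymous, `translate(recode(seq))` equals `translate(seq)`
for any in-frame input, which gives a simple check before the recoded sequence is sent for
synthesis.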

In another embodiment, the NOV-X expression vector is a yeast expression vector.
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30:
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation,
San Diego, Calif.), and picZ (Invitrogen Corp, San Diego, Calif.).

Alternatively, NOV-X can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g.,
SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the
pVL series (Luckow and Summers, 1989. Virology 170: 31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells
using a mammalian expression vector. Examples of mammalian expression vectors include
pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO J. 6:
187-195). When used in mammalian cells, the expression vector's control functions are often
provided by viral regulatory elements. For example, commonly used promoters are derived
from polyoma, adenovirus 2, cytomegalovirus, and simian virus 40. For other suitable
expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of
Sambrook, et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

In another embodiment, the recombinant mammalian expression vector is capable of 
directing expression of the nucleic acid preferentially in a particular cell type (e.g., 
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific
promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1:
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO J.
8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 374-379)
and the alpha-fetoprotein promoter (Camper and Tilghman, 1989. Genes Dev. 3: 537-546).

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation. That 
is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that allows 
for expression (by transcription of the DNA molecule) of an RNA molecule that is antisense to 
NOV-X mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the 
antisense orientation can be chosen that direct the continuous expression of the antisense RNA 
molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory 
sequences can be chosen that direct constitutive, tissue specific or cell type specific expression 
of antisense RNA. The antisense expression vector can be in the form of a recombinant 
plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the 
control of a high efficiency regulatory region, the activity of which can be determined by the 
cell type into which the vector is introduced. For a discussion of the regulation of gene 
expression using antisense genes see, e.g., Weintraub, et al., "Antisense RNA as a molecular 
tool for genetic analysis," Reviews-Trends in Genetics, Vol. 1(1) 1986. 
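
The relationship between insert orientation and the resulting transcript can be sketched as
follows. This is a simplified model, not a cloning protocol: it assumes that placing the insert
in the reverse orientation makes the reverse complement of the sense cDNA the template-equivalent
coding strand downstream of the promoter, and the short sequence in the usage note is a
hypothetical placeholder rather than any NOV-X sequence.

```python
# Watson-Crick complement for uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """Return the RNA that has the same sequence as the given coding strand."""
    return coding_strand.upper().replace("T", "U")


def antisense_transcript(sense_cdna: str) -> str:
    """RNA produced when the cDNA is cloned in the antisense orientation.

    The reverse-complemented insert is read as the coding strand, so the
    transcript is complementary to the mRNA made from the sense orientation.
    """
    return transcribe(reverse_complement(sense_cdna))
```

For the placeholder cDNA `ATGGCCTTA`, the sense mRNA is `AUGGCCUUA` and the antisense transcript
is `UAAGGCCAU`; read in opposite directions, the two strands pair base for base.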

Another aspect of the invention pertains to host cells into which a recombinant 
expression vector of the invention has been introduced. The terms "host cell" and 
"recombinant host cell" are used interchangeably herein. It is understood that such terms refer 
not only to the particular subject cell but also to the progeny or potential progeny of such a 
cell. Because certain modifications may occur in succeeding generations due to either 
mutation or environmental influences, such progeny may not, in fact, be identical to the parent 
cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, NOV-X protein
can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such
as human, Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are
known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" are intended to refer to a variety of art-recognized techniques for introducing
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting host cells can be found in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989),
and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may integrate
the foreign DNA into their genome. In order to identify and select these integrants, a gene that
encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the
host cells along with the gene of interest. Various selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding
NOV-X or can be introduced on a separate vector. Cells stably transfected with the introduced
nucleic acid can be identified by drug selection (e.g., cells that have incorporated the
selectable marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can
be used to produce (i.e., express) NOV-X protein. Accordingly, the invention further provides
methods for producing NOV-X protein using the host cells of the invention. In one
embodiment, the method comprises culturing the host cell of the invention (into which a
recombinant expression vector encoding NOV-X protein has been introduced) in a suitable
medium such that NOV-X protein is produced. In another embodiment, the method further
comprises isolating NOV-X protein from the medium or the host cell.

Transgenic NOV-X Animals 

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or 
an embryonic stem cell into which NOV-X protein-coding sequences have been introduced.
Such host cells can then be used to create non-human transgenic animals in which exogenous
NOV-X sequences have been introduced into their genome or homologous recombinant
animals in which endogenous NOV-X sequences have been altered. Such animals are useful
for studying the function and/or activity of NOV-X protein and for identifying and/or
evaluating modulators of NOV-X protein activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in
which one or more of the cells of the animal includes a transgene. Other examples of
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens,
amphibians, etc. A transgene is exogenous DNA that is integrated into the genome of a cell
from which a transgenic animal develops and that remains in the genome of the mature
animal, thereby directing the expression of an encoded gene product in one or more cell types
or tissues of the transgenic animal. As used herein, a "homologous recombinant animal" is a
non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous
NOV-X gene has been altered by homologous recombination between the endogenous gene
and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell
of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing NOV-X-encoding
nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by microinjection, retroviral
infection) and allowing the oocyte to develop in a pseudopregnant female foster animal.
Sequences including SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 can be
introduced as a transgene into the genome of a non-human animal. Alternatively, a
non-human homologue of the human NOV-X gene, such as a mouse NOV-X gene, can be isolated
based on hybridization to the human NOV-X cDNA (described further supra) and used as a
transgene. Intronic sequences and polyadenylation signals can also be included in the
transgene to increase the efficiency of expression of the transgene. A tissue-specific
regulatory sequence(s) can be operably-linked to the NOV-X transgene to direct expression of
NOV-X protein to particular cells. Methods for generating transgenic animals via embryo
manipulation and microinjection, particularly animals such as mice, have become
conventional in the art and are described, for example, in U.S. Patent Nos. 4,736,866;
4,870,009; and 4,873,191; and Hogan, 1986. In: Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Similar methods are used for
production of other transgenic animals. A transgenic founder animal can be identified based
upon the presence of the NOV-X transgene in its genome and/or expression of NOV-X mRNA
in tissues or cells of the animals. A transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene
encoding NOV-X protein can further be bred to other transgenic animals carrying other
transgenes.

To create a homologous recombinant animal, a vector is prepared which contains at 
least a portion of a NOV-X gene into which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt, the NOV-X gene. The NOV-X gene can
be a human gene (e.g., the DNA of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57),
but more preferably, is a non-human homologue of a human NOV-X gene. For example, a
mouse homologue of human NOV-X gene of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20,
22, or 57 can be used to construct a homologous recombination vector suitable for altering an
endogenous NOV-X gene in the mouse genome. In one embodiment, the vector is designed 
such that, upon homologous recombination, the endogenous NOV-X gene is functionally 
disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out" 
vector). 

Alternatively, the vector can be designed such that, upon homologous recombination,
the endogenous NOV-X gene is mutated or otherwise altered but still encodes functional
protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of
the endogenous NOV-X protein). In the homologous recombination vector, the altered portion
of the NOV-X gene is flanked at its 5'- and 3'-termini by additional nucleic acid of the NOV-X
gene to allow for homologous recombination to occur between the exogenous NOV-X gene
carried by the vector and an endogenous NOV-X gene in an embryonic stem cell. The
additional flanking NOV-X nucleic acid is of sufficient length for successful homologous
recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both
at the 5'- and 3'-termini) are included in the vector. See, e.g., Thomas, et al., 1987. Cell 51:
503 for a description of homologous recombination vectors. The vector is then introduced into
an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced
NOV-X gene has homologously-recombined with the endogenous NOV-X gene are selected. See,
e.g., Li, et al., 1992. Cell 69: 915.
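
The layout of a homologous recombination vector insert, an altered portion flanked by 5' and 3'
arms copied from the endogenous gene, can be sketched as simple string operations. This is a
toy model under stated assumptions: the function names, coordinates and sequences are
hypothetical, the arms in the usage note are four bases long only for readability (the text
above calls for several kilobases), and `recombine` models only the idealized outcome of a
successful double crossover.

```python
def build_targeting_insert(locus: str, start: int, end: int,
                           replacement: str, arm_length: int) -> str:
    """Return 5' arm + altered portion + 3' arm for a targeting vector.

    locus        genomic sequence around the endogenous gene
    start, end   half-open interval [start, end) of the portion to be altered
    replacement  sequence that takes the place of locus[start:end]
    arm_length   length of each flanking homology arm
    """
    if not 0 <= start <= end <= len(locus):
        raise ValueError("altered region must lie within the locus")
    if start < arm_length or len(locus) - end < arm_length:
        raise ValueError("not enough flanking sequence for the requested arms")
    five_prime_arm = locus[start - arm_length:start]
    three_prime_arm = locus[end:end + arm_length]
    return five_prime_arm + replacement + three_prime_arm


def recombine(locus: str, start: int, end: int, replacement: str) -> str:
    """Idealized product of homologous recombination at the targeted region."""
    return locus[:start] + replacement + locus[end:]
```

With the placeholder locus `AAAACCCCGGGGTTTT`, region `[6, 10)`, replacement `TTAATTAA` and
four-base arms, the insert is `AACCTTAATTAAGGTT`, and the recombined locus contains that insert
in place of the original region.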

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 113-152.
A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal
and the embryo brought to term. Progeny harboring the homologously-recombined DNA in
their germ cells can be used to breed animals in which all cells of the animal contain the
homologously-recombined DNA by germline transmission of the transgene. Methods for
constructing homologous recombination vectors and homologous recombinant animals are
described further in Bradley, 1991. Curr. Opin. Biotechnol. 2: 823-829; PCT International
Publication Nos.: WO 90/11354; WO 91/01140; WO 92/0968; and WO 93/04169.

In another embodiment, transgenic non-human animals can be produced that contain
selected systems that allow for regulated expression of the transgene. One example of such a
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the
cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. USA 89:
6232-6236. Another example of a recombinase system is the FLP recombinase system of
Saccharomyces cerevisiae. See O'Gorman, et al., 1991. Science 251:1351-1355. If a cre/loxP
recombinase system is used to regulate expression of the transgene, animals containing
transgenes encoding both the Cre recombinase and a selected protein are required. Such
animals can be provided through the construction of "double" transgenic animals, e.g., by
mating two transgenic animals, one containing a transgene encoding a selected protein and the
other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, a 
cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit the 
growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of
electrical pulses, to an enucleated oocyte from an animal of the same species from which the
quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to
morula or blastocyst and then transferred to a pseudopregnant female foster animal. The
offspring borne of this female foster animal will be a clone of the animal from which the cell
(e.g., the somatic cell) is isolated.

Pharmaceutical Compositions 

The NOV-X nucleic acid molecules, NOV-X proteins, and anti-NOV-X antibodies 
(also referred to herein as "active compounds") of the invention, and derivatives, fragments, 
analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable 
for administration. Such compositions typically comprise the nucleic acid molecule, protein,
or antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically
acceptable carrier" is intended to include any and all solvents, dispersion media, coatings,
antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like,
compatible with pharmaceutical administration. Suitable carriers are described in the most
recent edition of Remington's Pharmaceutical Sciences, a standard reference text in the field,
which is incorporated herein by reference. Preferred examples of such carriers or diluents
include, but are not limited to, water, saline, Ringer's solutions, dextrose solution, and 5%
human serum albumin. Liposomes and non-aqueous vehicles such as fixed oils may also be
used. The use of such media and agents for pharmaceutically active substances is well known
in the art. Except insofar as any conventional media or agent is incompatible with the active
compound, use thereof in the compositions is contemplated. Supplementary active
compounds can also be incorporated into the compositions.

The antibodies disclosed herein can also be formulated as immunoliposomes. 

Liposomes containing the antibody are prepared by methods known in the art, such as
described in Epstein et al., Proc. Natl. Acad. Sci. USA, 82: 3688 (1985); Hwang et al., Proc.
Natl. Acad. Sci. USA, 77: 4030 (1980); and U.S. Pat. Nos. 4,485,045 and 4,544,545.
Liposomes with enhanced circulation time are disclosed in U.S. Patent No. 5,013,556. 

Particularly useful liposomes can be generated by the reverse-phase evaporation
method with a lipid composition comprising phosphatidylcholine, cholesterol, and
PEG-derivatized phosphatidylethanolamine (PEG-PE). Liposomes are extruded through filters of
defined pore size to yield liposomes with the desired diameter. Fab' fragments of the antibody
of the present invention can be conjugated to the liposomes as described in Martin et al., J.
Biol. Chem., 257: 286-288 (1982) via a disulfide-interchange reaction. A chemotherapeutic
agent (such as doxorubicin) is optionally contained within the liposome. See Gabizon et al., J.
National Cancer Inst., 81(19): 1484 (1989).

A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include parenteral, 
e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (i.e., topical),
transmucosal, and rectal administration. Solutions or suspensions used for parenteral,
intradermal, or subcutaneous application can include the following components: a sterile
diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine,
propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or
methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such
as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates;
and agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be
adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral
preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of
glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous
preparation of sterile injectable solutions or dispersion. For intravenous administration,
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF,
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be
sterile and should be fluid to the extent that easy syringeability exists. It must be stable under
the conditions of manufacture and storage and must be preserved against the contaminating
action of microorganisms such as bacteria and fungi. The carrier can be a solvent or
dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol,
propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof.
The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by
the maintenance of the required particle size in the case of dispersion and by the use of
surfactants. Prevention of the action of microorganisms can be achieved by various
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic
acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents,
for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the
composition. Prolonged absorption of the injectable compositions can be brought about by
including in the composition an agent which delays absorption, for example, aluminum
monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g.,
a NOV-X protein or anti-NOV-X antibody) in the required amount in an appropriate solvent
with one or a combination of ingredients enumerated above, as required, followed by filtered
sterilization. Generally, dispersions are prepared by incorporating the active compound into a
sterile vehicle that contains a basic dispersion medium and the required other ingredients from
those enumerated above. In the case of sterile powders for the preparation of sterile injectable
solutions, methods of preparation are vacuum drying and freeze-drying that yields a powder of
the active ingredient plus any additional desired ingredient from a previously sterile-filtered
solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be 
enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic
administration, the active compound can be incorporated with excipients and used in the form
of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier
for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and
swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or
adjuvant materials can be included as part of the composition. The tablets, pills, capsules,
troches and the like can contain any of the following ingredients, or compounds of a similar
nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient
such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a
lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint,
methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an
aerosol spray from a pressurized container or dispenser which contains a suitable propellant,
e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal 
or transdermal administration, penetrants appropriate to the barrier to be permeated are used in
the formulation. Such penetrants are generally known in the art, and include, for example, for
transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal
administration can be accomplished through the use of nasal sprays or suppositories. For
transdermal administration, the active compounds are formulated into ointments, salves, gels,
or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with
conventional suppository bases such as cocoa butter and other glycerides) or retention enemas
for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect
the compound against rapid elimination from the body, such as a controlled release
formulation, including implants and microencapsulated delivery systems. Biodegradable,
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides,
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of
such formulations will be apparent to those skilled in the art. The materials can also be
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral
antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared
according to methods known to those skilled in the art, for example, as described in U.S.
Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form
for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to
physically discrete units suited as unitary dosages for the subject to be treated; each unit
containing a predetermined quantity of active compound calculated to produce the desired
therapeutic effect in association with the required pharmaceutical carrier. The specifications
for the dosage unit forms of the invention are dictated by and directly dependent on the
unique characteristics of the active compound and the particular therapeutic effect to be
achieved, and the limitations inherent in the art of compounding such an active compound for
the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example,
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 3054-3057).
The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery
vehicle is embedded. Alternatively, where the complete gene delivery vector can be produced
intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can
include one or more cells that produce the gene delivery system.

Antibodies specifically binding a protein of the invention, as well as other molecules
identified by the screening assays disclosed herein, can be administered for the treatment of
various disorders in the form of pharmaceutical compositions. Principles and considerations
involved in preparing such compositions, as well as guidance in the choice of components, are
provided, for example, in Remington: The Science And Practice Of Pharmacy, 19th ed.
(Alfonso R. Gennaro, et al., editors) Mack Pub. Co., Easton, Pa.: 1995; Drug Absorption
Enhancement: Concepts, Possibilities, Limitations, And Trends, Harwood Academic
Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug Delivery (Advances In
Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York.

If the antigenic protein is intracellular and whole antibodies are used as inhibitors, internalizing
antibodies are preferred.
However, liposomes can also be used to deliver the antibody, or an antibody fragment, into
cells. Where antibody fragments are used, the smallest inhibitory fragment that specifically
binds to the binding domain of the target protein is preferred. For example, based upon the
variable-region sequences of an antibody, peptide molecules can be designed that retain the
ability to bind the target protein sequence. Such peptides can be synthesized chemically
and/or produced by recombinant DNA technology. See, e.g., Marasco et al., 1993. Proc. Natl.
Acad. Sci. USA 90: 7889-7893. The formulation herein can also contain more than one
active compound as necessary for the particular indication being treated, preferably those with
complementary activities that do not adversely affect each other. Alternatively, or in addition,
the composition can comprise an agent that enhances its function, such as, for example, a
cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent. Such
molecules are suitably present in combination in amounts that are effective for the purpose
intended. The active ingredients can also be entrapped in microcapsules prepared, for
example, by coacervation techniques or by interfacial polymerization, for example,
hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate)
microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes,
albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in
macroemulsions.

The formulations to be used for in vivo administration must be sterile. This is readily
accomplished by filtration through sterile filtration membranes.

Sustained-release preparations can be prepared. Suitable examples of sustained-release
preparations include semipermeable matrices of solid hydrophobic polymers containing the
antibody, which matrices are in the form of shaped articles, e.g., films, or microcapsules.
Examples of sustained-release matrices include polyesters, hydrogels (for example,
poly(2-hydroxyethyl-methacrylate), or poly(vinylalcohol)), polylactides (U.S. Pat. No. 3,773,919),
copolymers of L-glutamic acid and ethyl-L-glutamate, non-degradable ethylene-vinyl
acetate, degradable lactic acid-glycolic acid copolymers such as the LUPRON DEPOT™
(injectable microspheres composed of lactic acid-glycolic acid copolymer and leuprolide
acetate), and poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl
acetate and lactic acid-glycolic acid enable release of molecules for over 100 days, certain
hydrogels release proteins for shorter time periods.

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 


Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOV-X
protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications),
to detect NOV-X mRNA (e.g., in a biological sample) or a genetic lesion in a NOV-X gene,
and to modulate NOV-X activity, as described further below. In addition, the NOV-X
proteins can be used to screen drugs or compounds that modulate the NOV-X protein activity
or expression as well as to treat disorders characterized by insufficient or excessive production
of NOV-X protein or production of NOV-X protein forms that have decreased or aberrant
activity compared to NOV-X wild-type protein. In addition, the anti-NOV-X antibodies of the
invention can be used to detect and isolate NOV-X proteins and modulate NOV-X activity.
For example, NOV-X activity includes growth and differentiation, antibody production, and
tumor growth.

The invention further pertains to novel agents identified by the screening assays 
described herein and uses thereof for treatments as described, supra. 

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, 
peptidomimetics, small molecules or other drugs) that bind to NOV-X proteins or have a 
stimulatory or inhibitory effect on, e.g., NOV-X protein expression or NOV-X protein activity. 
The invention also includes compounds identified in the screening assays described herein. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of a NOV-X 
protein or polypeptide or biologically-active portion thereof. The test compounds of the 
invention can be obtained using any of the numerous approaches in combinatorial library 
methods known in the art, including: biological libraries; spatially addressable parallel solid 
phase or solution phase libraries; synthetic library methods requiring deconvolution; the
"one-bead one-compound" library method; and synthetic library methods using affinity
chromatography selection. The biological library approach is limited to peptide libraries,
while the other four approaches are applicable to peptide, non-peptide oligomer or small
molecule libraries of compounds. See, e.g., Lam, 1997. Anticancer Drug Design 12: 145.

A "small molecule" as used herein, is meant to refer to a composition that has a 
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small 
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, 
lipids or other organic or inorganic molecules. Libraries of chemical and/or biological 
mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can be screened 
with any of the assays of the invention. 

Examples of methods for the synthesis of molecular libraries can be found in the art, 
for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 1994. 
Proc. Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 2678; 
Cho, et al., 1993. Science 261: 1303; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33:
2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et al., 1994. J.
Med. Chem. 37: 1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992.
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor,
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner,
U.S. Patent 5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89:
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. Science
249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; Felici, 1991.
J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,233,409).

In one embodiment, an assay is a cell-based assay in which a cell which expresses a
membrane-bound form of NOV-X protein, or a biologically-active portion thereof, on the cell
surface is contacted with a test compound and the ability of the test compound to bind to a
NOV-X protein is determined. The cell, for example, can be of mammalian origin or a yeast
cell. Determining the ability of the test compound to bind to the NOV-X protein can be
accomplished, for example, by coupling the test compound with a radioisotope or enzymatic
label such that binding of the test compound to the NOV-X protein or biologically-active
portion thereof can be determined by detecting the labeled compound in a complex. For
example, test compounds can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly,
and the radioisotope detected by direct counting of radioemission or by scintillation counting.
Alternatively, test compounds can be enzymatically-labeled with, for example, horseradish
peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by
determination of conversion of an appropriate substrate to product. In one embodiment, the
assay comprises contacting a cell which expresses a membrane-bound form of NOV-X
protein, or a biologically-active portion thereof, on the cell surface with a known compound
which binds NOV-X to form an assay mixture, contacting the assay mixture with a test
compound, and determining the ability of the test compound to interact with a NOV-X protein,
wherein determining the ability of the test compound to interact with a NOV-X protein
comprises determining the ability of the test compound to preferentially bind to NOV-X
protein or a biologically-active portion thereof as compared to the known compound.
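
The competitive format described above can be illustrated with a small calculation. The following is a minimal sketch only: the function names, the background subtraction, and the 50% displacement cutoff are assumptions introduced for illustration and are not parameters stated in this specification. It estimates how much of a labeled known compound is displaced from NOV-X when a test compound is added.

```python
def fraction_displaced(signal_without_test, signal_with_test, background=0.0):
    """Fraction of labeled known compound displaced by the test compound.

    Signals are bound-label readings (e.g., scintillation counts) measured
    in the absence and presence of the test compound; `background` is the
    reading from a no-protein control (an assumed correction step).
    """
    bound_without = signal_without_test - background
    bound_with = signal_with_test - background
    if bound_without <= 0:
        raise ValueError("no specific binding of the known compound detected")
    displaced = 1.0 - bound_with / bound_without
    # Clamp to [0, 1] so measurement noise cannot yield impossible values.
    return min(1.0, max(0.0, displaced))


def binds_preferentially(signal_without_test, signal_with_test,
                         background=0.0, cutoff=0.5):
    """True when the test compound displaces at least `cutoff` of the label.

    The 0.5 cutoff is a hypothetical threshold chosen for illustration.
    """
    return fraction_displaced(signal_without_test, signal_with_test,
                              background) >= cutoff
```

In practice the cutoff would be set from replicate controls rather than fixed in advance.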

In another embodiment, an assay is a cell-based assay comprising contacting a cell
expressing a membrane-bound form of NOV-X protein, or a biologically-active portion
thereof, on the cell surface with a test compound and determining the ability of the test
compound to modulate (e.g., stimulate or inhibit) the activity of the NOV-X protein or
biologically-active portion thereof. Determining the ability of the test compound to modulate
the activity of NOV-X or a biologically-active portion thereof can be accomplished, for
example, by determining the ability of the NOV-X protein to bind to or interact with a NOV-X
target molecule. As used herein, a "target molecule" is a molecule with which a NOV-X
protein binds or interacts in nature, for example, a molecule on the surface of a cell which
expresses a NOV-X interacting protein, a molecule on the surface of a second cell, a molecule
in the extracellular milieu, a molecule associated with the internal surface of a cell membrane
or a cytoplasmic molecule. A NOV-X target molecule can be a non-NOV-X molecule or a
NOV-X protein or polypeptide of the invention. In one embodiment, a NOV-X target
molecule is a component of a signal transduction pathway that facilitates transduction of an
extracellular signal (e.g., a signal generated by binding of a compound to a membrane-bound
NOV-X molecule) through the cell membrane and into the cell. The target, for example, can
be a second intercellular protein that has catalytic activity or a protein that facilitates the
association of downstream signaling molecules with NOV-X.

Determining the ability of the NOV-X protein to bind to or interact with a NOV-X
target molecule can be accomplished by one of the methods described above for determining
direct binding.

In one embodiment, determining the ability of the NOV-X protein to bind to or
interact with a NOV-X target molecule can be accomplished by determining the activity of the
target molecule. For example, the activity of the target molecule can be determined by
detecting induction of a cellular second messenger of the target (e.g., intracellular Ca2+,
diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity of the target on an appropriate
substrate, detecting the induction of a reporter gene (comprising a NOV-X-responsive
regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g.,
luciferase), or detecting a cellular response, for example, cell survival, cellular differentiation,
or cell proliferation.

In yet another embodiment, an assay of the invention is a cell-free assay comprising
contacting a NOV-X protein or biologically-active portion thereof with a test compound and
determining the ability of the test compound to bind to the NOV-X protein or
biologically-active portion thereof. Binding of the test compound to the NOV-X protein can be determined
either directly or indirectly as described above.

In one such embodiment, the assay comprises contacting the NOV-X protein or
biologically-active portion thereof with a known compound which binds NOV-X to form an
assay mixture, contacting the assay mixture with a test compound, and determining the ability
of the test compound to interact with a NOV-X protein, wherein determining the ability of the
test compound to interact with a NOV-X protein comprises determining the ability of the test
compound to preferentially bind to NOV-X or biologically-active portion thereof as compared
to the known compound.

In still another embodiment, an assay is a cell-free assay comprising contacting
NOV-X protein or biologically-active portion thereof with a test compound and determining the
ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the NOV-X
protein or biologically-active portion thereof. Determining the ability of the test compound to
modulate the activity of NOV-X can be accomplished, for example, by determining the ability
of the NOV-X protein to bind to a NOV-X target molecule by one of the methods described
above for determining direct binding. In an alternative embodiment, determining the ability of
the test compound to modulate the activity of NOV-X protein can be accomplished by
determining the ability of the NOV-X protein to further modulate a NOV-X target molecule. For
example, the catalytic/enzymatic activity of the target molecule on an appropriate substrate
can be determined as described above.

In yet another embodiment, the cell-free assay comprises contacting the NOV-X
protein or biologically-active portion thereof with a known compound which binds NOV-X
protein to form an assay mixture, contacting the assay mixture with a test compound, and
determining the ability of the test compound to interact with a NOV-X protein, wherein
determining the ability of the test compound to interact with a NOV-X protein comprises
determining the ability of the NOV-X protein to preferentially bind to or modulate the activity
of a NOV-X target molecule.

The cell-free assays of the invention are amenable to use of either the soluble form or
the membrane-bound form of NOV-X protein. In the case of cell-free assays comprising the
membrane-bound form of NOV-X protein, it may be desirable to utilize a solubilizing agent
such that the membrane-bound form of NOV-X protein is maintained in solution. Examples
of such solubilizing agents include non-ionic detergents such as n-octylglucoside,
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide,
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®,
isotridecylpoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), or
3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO).

In more than one embodiment of the above assay methods of the invention, it may be 
desirable to immobilize either NOV-X protein or its target molecule to facilitate separation of 
complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate 
automation of the assay. Binding of a test compound to NOV-X protein, or interaction of 

NOV-X protein with a target molecule in the presence and absence of a candidate compound,
can be accomplished in any vessel suitable for containing the reactants. Examples of such
vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a
fusion protein can be provided that adds a domain that allows one or both of the proteins to be
bound to a matrix. For example, GST-NOV-X fusion proteins or GST-target fusion proteins
can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or
glutathione derivatized microtiter plates, that are then combined with the test compound or the
test compound and either the non-adsorbed target protein or NOV-X protein, and the mixture
is incubated under conditions conducive to complex formation (e.g., at physiological
conditions for salt and pH). Following incubation, the beads or microtiter plate wells are
washed to remove any unbound components, the matrix is immobilized in the case of beads, and
complex formation is determined either directly or indirectly, for example, as described, supra.
Alternatively, the complexes can be dissociated from the matrix, and the level of NOV-X
protein binding or activity determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be used in the
screening assays of the invention. For example, either the NOV-X protein or its target
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated
NOV-X protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation kit,
Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well
plates (Pierce Chemical). Alternatively, antibodies reactive with NOV-X protein or target
molecules, but which do not interfere with binding of the NOV-X protein to its target
molecule, can be derivatized to the wells of the plate, and unbound target or NOV-X protein
trapped in the wells by antibody conjugation. Methods for detecting such complexes, in
addition to those described above for the GST-immobilized complexes, include
immunodetection of complexes using antibodies reactive with the NOV-X protein or target
molecule, as well as enzyme-linked assays that rely on detecting an enzymatic activity
associated with the NOV-X protein or target molecule.

In another embodiment, modulators of NOV-X protein expression are identified in a 
method wherein a cell is contacted with a candidate compound and the expression of NOV-X 
mRNA or protein in the cell is determined. The level of expression of NOV-X mRNA or 
protein in the presence of the candidate compound is compared to the level of expression of 
NOV-X mRNA or protein in the absence of the candidate compound. The candidate
compound can then be identified as a modulator of NOV-X mRNA or protein expression
based upon this comparison. For example, when expression of NOV-X mRNA or protein is
greater (i.e., statistically significantly greater) in the presence of the candidate compound than
in its absence, the candidate compound is identified as a stimulator of NOV-X mRNA or
protein expression. Alternatively, when expression of NOV-X mRNA or protein is less
(statistically significantly less) in the presence of the candidate compound than in its absence,
the candidate compound is identified as an inhibitor of NOV-X mRNA or protein expression.
The level of NOV-X mRNA or protein expression in the cells can be determined by methods
described herein for detecting NOV-X mRNA or protein.
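
The comparison described above can be sketched in code. The specification does not name a statistical test, so the exact permutation test on the difference in means, the 0.05 significance level, and the function names below are all assumptions made for illustration. The sketch classifies a candidate compound from replicate expression measurements taken in its presence and absence.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def classify_modulator(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"
```

An exact permutation test is used here only because it needs no distributional assumptions and no external libraries; with many replicates a sampled permutation test or a t-test would be more practical.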

In yet another aspect of the invention, the NOV-X proteins can be used as "bait
proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No. 5,283,317;
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 12046-12054;
Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. Oncogene 8:
1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or interact with
NOV-X ("NOV-X-binding proteins" or "NOV-X-bp") and modulate NOV-X activity. Such
NOV-X-binding proteins are also likely to be involved in the propagation of signals by the
NOV-X proteins as, for example, upstream or downstream elements of the NOV-X pathway.

The two-hybrid system is based on the modular nature of most transcription factors, 
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes 
two different DNA constructs. In one construct, the gene that codes for NOV-X is fused to a 
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the 
other construct, a DNA sequence, from a library of DNA sequences, that encodes an 
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation 
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to 
interact, in vivo, forming a NOV-X-dependent complex, the DNA-binding and activation 
domains of the transcription factor are brought into close proximity. This proximity allows 
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional 
regulatory site responsive to the transcription factor. Expression of the reporter gene can be 
detected and cell colonies containing the functional transcription factor can be isolated and 
used to obtain the cloned gene that encodes the protein which interacts with NOV-X. 
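
The logic of the two-hybrid readout can be summarized as a toy model. This is a sketch under stated assumptions, not a description of any laboratory protocol: the prey names and the set of interacting pairs are hypothetical placeholders, and real screens must also contend with false positives and negatives that this model ignores.

```python
def reporter_expressed(bait, prey, interactions):
    """Return True when bait and prey reconstitute the transcription factor.

    `bait` is the protein fused to the DNA-binding domain and `prey` is the
    library protein fused to the activation domain. `interactions` is a set
    of frozensets naming protein pairs assumed to bind each other.
    """
    return frozenset((bait, prey)) in interactions


def screen_library(bait, library, interactions):
    """List library (prey) proteins whose colonies would express the reporter."""
    return [prey for prey in library
            if reporter_expressed(bait, prey, interactions)]


# Hypothetical data for illustration; the prey names are placeholders.
known_pairs = {frozenset(("NOV-X", "preyA")), frozenset(("preyB", "preyC"))}
positives = screen_library("NOV-X", ["preyA", "preyB", "preyC"], known_pairs)
```

Only prey that interact with the bait bring the two domains together, so only those colonies would score as reporter-positive.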

The invention further pertains to novel agents identified by the aforementioned
screening assays and uses thereof for treatments as described herein.

Detection Assays

Portions or fragments of the cDNA sequences identified herein (and the corresponding
complete gene sequences) can be used in numerous ways as polynucleotide reagents. By way
of example, and not of limitation, these sequences can be used to: (i) identify an individual
from a minute biological sample (tissue typing); and (ii) aid in forensic identification of a
biological sample. Some of these applications are described in the subsections, below.

Tissue Typing 

The NOV-X sequences of the invention can be used to identify individuals from minute
biological samples. In this technique, an individual's genomic DNA is digested with one or
more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. The sequences of the invention are useful as additional DNA markers for RFLP
("restriction fragment length polymorphisms," described in U.S. Patent No. 5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative
technique that determines the actual base-by-base DNA sequence of selected portions of an
individual's genome. Thus, the NOV-X sequences described herein can be used to prepare
two PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be used
to amplify an individual's DNA and subsequently sequence it.
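
Preparing a primer pair from the two termini of a sequence can be sketched as follows. The primer length of 20 bases and the function names are assumptions made for illustration; the specification does not prescribe them, and practical primer design would also weigh melting temperature and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_terminal_primers(sequence, length=20):
    """Return (forward, reverse) primers taken from the sequence termini.

    The forward primer copies the 5' end of the given strand. The reverse
    primer is the reverse complement of the 3' end, so that it anneals to
    the given strand and extends back toward the 5' end.
    """
    if length <= 0 or 2 * length > len(sequence):
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sequence[:length]
    reverse = reverse_complement(sequence[-length:])
    return forward, reverse
```

For example, `design_terminal_primers("ATGGCCAAGTTTCCCGGGTAA", length=6)` returns the pair `("ATGGCC", "TTACCC")`.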

Panels of corresponding DNA sequences from individuals, prepared in this manner,
can provide unique individual identifications, as each individual will have a unique set of such
DNA sequences due to allelic differences. The sequences of the invention can be used to
obtain such identification sequences from individuals and from tissue. The NOV-X sequences
of the invention uniquely represent portions of the human genome. Allelic variation occurs to
some degree in the coding regions of these sequences, and to a greater degree in the noncoding
regions. It is estimated that allelic variation between individual humans occurs with a
frequency of about once per each 500 bases. Much of the allelic variation is due to single
nucleotide polymorphisms (SNPs), which include restriction fragment length polymorphisms
(RFLPs).
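
How a single-base change produces a restriction fragment length polymorphism can be shown with a short sketch. The two allele sequences below are invented for illustration; the only external fact relied on is that EcoRI recognizes GAATTC and cuts the top strand after the first G. The function models a complete digest of a linear molecule on one strand only.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of fragments from a complete digest of a linear sequence.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases into the site at which the top strand is cut.
    """
    lengths = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    lengths.append(len(sequence) - start)
    return lengths


# Hypothetical alleles: allele_b carries an A-to-G change in the second site.
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
pattern_a = fragment_lengths(allele_a, "GAATTC", 1)  # two sites, three fragments
pattern_b = fragment_lengths(allele_b, "GAATTC", 1)  # one site, two fragments
```

The lost site in the second allele merges two fragments into one, which is the band difference a Southern blot would reveal.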

Each of the sequences described herein can, to some degree, be used as a standard
against which DNA from an individual can be compared for identification purposes. Because
greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are
necessary to differentiate individuals. The noncoding sequences can comfortably provide
positive individual identification with a panel of perhaps 10 to 1,000 primers that each yield a
noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57 are used, a more appropriate number of
primers for positive individual identification would be 500-2,000.
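
The panel sizes above can be related to the stated variation frequency with simple arithmetic. The sketch below assumes, for illustration only, that the average figure of about one variant per 500 bases applies uniformly to every amplified region; the text itself notes that noncoding regions vary more and coding regions less, so the numbers are rough guides rather than predictions.

```python
def expected_variant_sites(n_primers, amplicon_bases=100, bases_per_variant=500):
    """Expected number of variable positions covered by a primer panel.

    Assumes each primer yields one amplified sequence of `amplicon_bases`
    and that variants occur uniformly once per `bases_per_variant` bases.
    """
    return n_primers * amplicon_bases / bases_per_variant


low = expected_variant_sites(10)     # smallest noncoding panel in the text
high = expected_variant_sites(1000)  # largest noncoding panel in the text
```

Under these assumptions the smallest panel covers about 2 variable positions and the largest about 200.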



Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for
prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly,
one aspect of the invention relates to diagnostic assays for determining NOV-X protein and/or
nucleic acid expression as well as NOV-X activity, in the context of a biological sample (e.g.,
blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a
disease or disorder, or is at risk of developing a disorder, associated with aberrant NOV-X
expression or activity. Disorders associated with aberrant NOV-X expression or activity
include, for example, disorders of olfactory loss, e.g., trauma, HIV illness, neoplastic growth,
and neurological disorders, e.g., Parkinson's disease and Alzheimer's disease.

The invention also provides for prognostic (or predictive) assays for determining
whether an individual is at risk of developing a disorder associated with NOV-X protein,
nucleic acid expression or activity. For example, mutations in a NOV-X gene can be assayed
in a biological sample. Such assays can be used for prognostic or predictive purposes to
thereby prophylactically treat an individual prior to the onset of a disorder characterized by or
associated with NOV-X protein, nucleic acid expression, or biological activity.

Another aspect of the invention provides methods for determining NOV-X protein,
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or
prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents (e.g.,
drugs, compounds) on the expression or activity of NOV-X in clinical trials.
These and other agents are described in further detail in the following sections.

Diagnostic Assays 

An exemplary method for detecting the presence or absence of NOV-X in a biological
sample involves obtaining a biological sample from a test subject and contacting the biological
sample with a compound or an agent capable of detecting NOV-X protein or nucleic acid (e.g.,
mRNA, genomic DNA) that encodes NOV-X protein such that the presence of NOV-X is
detected in the biological sample. An agent for detecting NOV-X mRNA or genomic DNA is
a labeled nucleic acid probe capable of hybridizing to NOV-X mRNA or genomic DNA. The
nucleic acid probe can be, for example, a full-length NOV-X nucleic acid, such as the nucleic
acid of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a portion thereof, such as
an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to
specifically hybridize under stringent conditions to NOV-X mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the invention are described herein.

One agent for detecting NOV-X protein is an antibody capable of binding to NOV-X
protein, preferably an antibody with a detectable label. Antibodies directed against a protein
of the invention may be used in methods known within the art relating to the localization
and/or quantitation of the protein (e.g., for use in measuring levels of the protein within
appropriate physiological samples, for use in diagnostic methods, for use in imaging the
protein, and the like). In a given embodiment, antibodies against the proteins, or derivatives,
fragments, analogs or homologs thereof, that contain the antigen binding domain, are utilized
as pharmacologically-active compounds.

- An antibody specific for a protein of the invention can be used to isolate the protein by 
standard techniques, such as immunoaffinity chromatography or immxmoprecipitation. Such 
an antibody can facilitate the purification of the natural protein antigen firom cells and of 
20 recombinantly produced antigen expressed in host cells. Moreover, such an antibody can be 
used to detect the antigenic protein (e.g., in a cellular lysate or cell supernatant) in order to 
evaluate the abundance and pattern of expression of the antigenic protein. Antibodies directed 
against the protein can be used diagnostically to monitor protein levels in tissue as part of a 
clinical testing procedure, e.g., to determine the efficacy of a given treatment 
regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a 
detectable substance. Examples of detectable substances include various enzymes, prosthetic 
groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive 
materials. Examples of suitable enzymes include horseradish peroxidase, alkaline 
phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group 
complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent 
materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, 
dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a 
luminescent material includes luminol; examples of bioluminescent materials include 
luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, 
¹³¹I, ³⁵S or ³H. 

Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or 
a fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the 
probe or antibody, is intended to encompass direct labeling of the probe or antibody by 
coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as 
indirect labeling of the probe or antibody by reactivity with another reagent that is directly 
labeled. Examples of indirect labeling include detection of a primary antibody using a 
fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin such 
that it can be detected with fluorescently-labeled streptavidin. The term "biological sample" is 
intended to include tissues, cells and biological fluids isolated from a subject, as well as 
tissues, cells and fluids present within a subject. That is, the detection method of the invention 
can be used to detect NOV-X mRNA, protein, or genomic DNA in a biological sample in vitro 
as well as in vivo. For example, in vitro techniques for detection of NOV-X mRNA include 
Northern hybridizations and in situ hybridizations. In vitro techniques for detection of 
NOV-X protein include enzyme linked immunosorbent assays (ELISAs), Western blots, 
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of NOV-X 
genomic DNA include Southern hybridizations. Furthermore, in vivo techniques for detection 
of NOV-X protein include introducing into a subject a labeled anti-NOV-X antibody. For 
example, the antibody can be labeled with a radioactive marker whose presence and location 
in a subject can be detected by standard imaging techniques. 

In one embodiment, the biological sample contains protein molecules from the test 
subject. Alternatively, the biological sample can contain mRNA molecules from the test 
subject or genomic DNA molecules from the test subject. A preferred biological sample is a 
peripheral blood leukocyte sample isolated by conventional means from a subject. 

In one embodiment, the methods further involve obtaining a control biological sample from a 
control subject, contacting the control sample with a compound or agent capable of detecting 
NOV-X protein, mRNA, or genomic DNA, such that the presence of NOV-X protein, mRNA 
or genomic DNA is detected in the biological sample, and comparing the presence of NOV-X 
protein, mRNA or genomic DNA in the control sample with the presence of NOV-X protein, 
mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting the presence of NOV-X in a 
biological sample. For example, the kit can comprise: a labeled compound or agent capable of 
detecting NOV-X protein or mRNA in a biological sample; means for determining the amount 
of NOV-X in the sample; and means for comparing the amount of NOV-X in the sample with 
a standard. The compound or agent can be packaged in a suitable container. The kit can 
further comprise instructions for using the kit to detect NOV-X protein or nucleic acid. 

Prognostic Assays 

The diagnostic methods described herein can furthermore be utilized to identify 
subjects having or at risk of developing a disease or disorder associated with aberrant NOV-X 
expression or activity. For example, the assays described herein, such as the preceding 
diagnostic assays or the following assays, can be utilized to identify a subject having or at risk 
of developing a disorder associated with NOV-X protein, nucleic acid expression or activity. 
Such disorders include, for example, disorders of olfactory loss, e.g., trauma, HIV illness, 
neoplastic growth, and neurological disorders, e.g., Parkinson's disease and Alzheimer's 
disease. 

Alternatively, the prognostic assays can be utilized to identify a subject having or at 
risk for developing a disease or disorder. Thus, the invention provides a method for 
identifying a disease or disorder associated with aberrant NOV-X expression or activity in 
which a test sample is obtained from a subject and NOV-X protein or nucleic acid (e.g., 
mRNA, genomic DNA) is detected, wherein the presence of NOV-X protein or nucleic acid is 
diagnostic for a subject having or at risk of developing a disease or disorder associated with 
aberrant NOV-X expression or activity. As used herein, a "test sample" refers to a biological 
sample obtained from a subject of interest. For example, a test sample can be a biological 
fluid (e.g., serum), cell sample, or tissue. 

Furthermore, the prognostic assays described herein can be used to determine whether 
a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, 
peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder 
associated with aberrant NOV-X expression or activity. For example, such methods can be 
used to determine whether a subject can be effectively treated with an agent for a disorder. 
Thus, the invention provides methods for determining whether a subject can be effectively 
treated with an agent for a disorder associated with aberrant NOV-X expression or activity in 
which a test sample is obtained and NOV-X protein or nucleic acid is detected (e.g., wherein 
the presence of NOV-X protein or nucleic acid is diagnostic for a subject that can be 
administered the agent to treat a disorder associated with aberrant NOV-X expression or 
activity). 

The methods of the invention can also be used to detect genetic lesions in a NOV-X 
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder 
characterized by aberrant cell proliferation and/or differentiation. In various embodiments, the 
methods include detecting, in a sample of cells from the subject, the presence or absence of a 
genetic lesion characterized by at least one of an alteration affecting the integrity of a gene 
encoding a NOV-X protein, or the misexpression of the NOV-X gene. For example, such 
genetic lesions can be detected by ascertaining the existence of at least one of: (i) a deletion of 
one or more nucleotides from a NOV-X gene; (ii) an addition of one or more nucleotides to a 
NOV-X gene; (iii) a substitution of one or more nucleotides of a NOV-X gene; (iv) a 
chromosomal rearrangement of a NOV-X gene; (v) an alteration in the level of a messenger 
RNA transcript of a NOV-X gene; (vi) aberrant modification of a NOV-X gene, such as of the 
methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type splicing 
pattern of a messenger RNA transcript of a NOV-X gene; (viii) a non-wild-type level of a 
NOV-X protein; (ix) allelic loss of a NOV-X gene; and (x) inappropriate post-translational 
modification of a NOV-X protein. As described herein, there are a large number of assay 
techniques known in the art which can be used for detecting lesions in a NOV-X gene. A 
preferred biological sample is a peripheral blood leukocyte sample isolated by conventional 
means from a subject. However, any biological sample containing nucleated cells may be 
used, including, for example, buccal mucosal cells. 

In certain embodiments, detection of the lesion involves the use of a probe/primer in a 
polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such 
as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., 
Landegren, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994. Proc. Natl. 
Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for detecting point 
mutations in the NOV-X gene (see, Abravaya, et al., 1995. Nucl. Acids Res. 23: 675-682). 
This method can include the steps of collecting a sample of cells from a patient, isolating 
nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the 
nucleic acid sample with one or more primers that specifically hybridize to a NOV-X gene 
under conditions such that hybridization and amplification of the NOV-X gene (if present) 
occurs, and detecting the presence or absence of an amplification product, or detecting the size 
of the amplification product and comparing the length to a control sample. It is anticipated 
that PCR and/or LCR may be desirable to use as a preliminary amplification step in 
conjunction with any of the techniques used for detecting mutations described herein. 
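
The size comparison in the amplification step above can be illustrated in silico. The following is a minimal sketch, not the method of the invention: the template sequences and primers are invented toy strings, primer binding is modeled as exact string matching, and the function names are hypothetical.

```python
# Toy in-silico PCR: report the amplicon length defined by a primer pair,
# so a sample product can be compared with a control product.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Return the product length, or None if either primer fails to bind.

    Binding is simplified to an exact match on the given strand.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = revcomp(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical control template and a sample carrying a 2-base deletion.
CONTROL = "TTACGGACCCCCCGATTACAAA"
SAMPLE = "TTACGGACCCCGATTACAAA"
FORWARD = "ACGGA"
REVERSE = "TGTAATC"  # reverse complement of GATTACA

control_len = amplicon_length(CONTROL, FORWARD, REVERSE)
sample_len = amplicon_length(SAMPLE, FORWARD, REVERSE)
size_differs = control_len != sample_len
```

A difference between `control_len` and `sample_len` corresponds to the size shift that would be read from a gel; a result of `None` corresponds to the absence of an amplification product.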

Alternative amplification methods include: self sustained sequence replication (see, 
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification 
system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177); Qβ Replicase 
(see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid amplification 
method, followed by the detection of the amplified molecules using techniques well known to 
those of skill in the art. These detection schemes are especially useful for the detection of 
nucleic acid molecules if such molecules are present in very low numbers. 

In an alternative embodiment, mutations in a NOV-X gene from a sample cell can be 
identified by alterations in restriction enzyme cleavage patterns. For example, sample and 
control DNA are isolated, amplified (optionally), digested with one or more restriction 
endonucleases, and fragment length sizes are determined by gel electrophoresis and compared. 
Differences in fragment length sizes between sample and control DNA indicate mutations in 
the sample DNA. Moreover, sequence specific ribozymes (see, e.g., U.S. Patent 
No. 5,493,531) can be used to score for the presence of specific mutations by development or 
loss of a ribozyme cleavage site. 
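
The fragment-length comparison described above can be sketched as follows. This is an illustrative example only: the sequences are invented, the DNA is treated as linear and single-stranded text, and the EcoRI recognition site (GAATTC, cut after the first base) is used simply as a familiar example of a restriction endonuclease.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return sorted fragment lengths after cutting a linear sequence.

    The sequence is cut cut_offset bases into every occurrence of site.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


def fragments_differ(control, sample, site="GAATTC", cut_offset=1):
    """True if the two digests give different fragment length patterns."""
    return digest(control, site, cut_offset) != digest(sample, site, cut_offset)


# Hypothetical control DNA with two sites; the sample has lost the second
# site through a single base substitution (GAATTC -> GAGTTC).
CONTROL = "AAAAGAATTCTTTTTTGAATTCCC"
SAMPLE = "AAAAGAATTCTTTTTTGAGTTCCC"

control_fragments = digest(CONTROL)
sample_fragments = digest(SAMPLE)
mutation_suspected = fragments_differ(CONTROL, SAMPLE)
```

Here the loss of one cleavage site merges two control fragments into a single longer fragment in the sample, which is the kind of pattern difference that gel electrophoresis would reveal.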

In other embodiments, genetic mutations in NOV-X can be identified by hybridizing a 
sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing 
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human 
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic 
mutations in NOV-X can be identified in two dimensional arrays containing light-generated 
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of probes 
can be used to scan through long stretches of DNA in a sample and control to identify base 
changes between the sequences by making linear arrays of sequential overlapping probes. 
This step allows the identification of point mutations. This is followed by a second 
hybridization array that allows the characterization of specific mutations by using smaller, 
specialized probe arrays complementary to all variants or mutations detected. Each mutation 
array is composed of parallel probe sets, one complementary to the wild-type gene and the 
other complementary to the mutant gene. 

In yet another embodiment, any of a variety of sequencing reactions known in the art can be 
used to directly sequence the NOV-X gene and detect mutations by comparing the sequence of 
the sample NOV-X with the corresponding wild-type (control) sequence. Examples of 
sequencing reactions include those based on techniques developed by Maxam and Gilbert, 
1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad. Sci. USA 74: 
5463. It is also contemplated that any of a variety of automated sequencing procedures can be 
utilized when performing the diagnostic assays (see, e.g., Naeve, et al., 1995. Biotechniques 
19: 448), including sequencing by mass spectrometry (see, e.g., PCT International Publication 
No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36: 127-162; and Griffin, et al., 
1993. Appl. Biochem. Biotechnol. 38: 147-159). 

Other methods for detecting mutations in the NOV-X gene include methods in which 
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or 
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general, the 
art technique of "mismatch cleavage" starts by providing heteroduplexes formed by 
hybridizing (labeled) RNA or DNA containing the wild-type NOV-X sequence with 
potentially mutant RNA or DNA obtained from a tissue sample. The double-stranded 
duplexes are treated with an agent that cleaves single-stranded regions of the duplex, such as 
those that exist due to basepair mismatches between the control and sample strands. For 
instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids treated with 
S1 nuclease to enzymatically digest the mismatched regions. In other embodiments, either 
DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium tetroxide 
and with piperidine in order to digest mismatched regions. After digestion of the mismatched 
regions, the resulting material is then separated by size on denaturing polyacrylamide gels to 
determine the site of mutation. See, e.g., Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85: 
4397; Saleeba, et al., 1992. Methods Enzymol. 217: 286-295. In an embodiment, the control 
DNA or RNA can be labeled for detection. 

In still another embodiment, the mismatch cleavage reaction employs one or more 
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA 
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in 
NOV-X cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli 
cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T 
at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. According to 
an exemplary embodiment, a probe based on a NOV-X sequence, e.g., a wild-type NOV-X 
sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is 
treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be 
detected from electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039. 

In other embodiments, alterations in electrophoretic mobility will be used to identify 
mutations in NOV-X genes. For example, single strand conformation polymorphism (SSCP) 
may be used to detect differences in electrophoretic mobility between mutant and wild type 
nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 2766; Cotton, 
1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal. Tech. Appl. 9: 73-79. 
Single-stranded DNA fragments of sample and control NOV-X nucleic acids will be denatured 
and allowed to renature. The secondary structure of single-stranded nucleic acids varies 
according to sequence, and the resulting alteration in electrophoretic mobility enables the 
detection of even a single base change. The DNA fragments may be labeled or detected with labeled 
probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in 
which the secondary structure is more sensitive to a change in sequence. In one embodiment, 
the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex 
molecules on the basis of changes in electrophoretic mobility. See, e.g., Keen, et al., 1991. 
Trends Genet. 7: 5. 

In yet another embodiment, the movement of mutant or wild-type fragments in 
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient 
gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495. When DGGE is 
used as the method of analysis, DNA will be modified to ensure that it does not completely 
denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich 
DNA by PCR. In a further embodiment, a temperature gradient is used in place of a 
denaturing gradient to identify differences in the mobility of control and sample DNA. See, 
e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753. 

Examples of other techniques for detecting point mutations include, but are not limited 
to, selective oligonucleotide hybridization, selective amplification, or selective primer 
extension. For example, oligonucleotide primers may be prepared in which the known 
mutation is placed centrally and then hybridized to target DNA under conditions that permit 
hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986. Nature 324: 163; 
Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele specific oligonucleotides 
are hybridized to PCR amplified target DNA or a number of different mutations when the 
oligonucleotides are attached to the hybridizing membrane and hybridized with labeled target 
DNA. 
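
The design of an allele specific oligonucleotide with the known mutation placed centrally can be sketched as below. This is a simplified illustration: the reference sequence is invented, strand complementarity is ignored (the probe is written as the same-strand sequence), and "hybridization only if a perfect match is found" is modeled as an exact substring match.

```python
def aso_probe(reference, pos, alt_base, length=15):
    """Return a probe of odd length with the mutated base at its center."""
    if length % 2 == 0:
        raise ValueError("length must be odd so the mutation can be centered")
    half = length // 2
    if pos - half < 0 or pos + half >= len(reference):
        raise ValueError("position too close to the sequence end")
    mutant = reference[:pos] + alt_base + reference[pos + 1:]
    return mutant[pos - half:pos + half + 1]


def hybridizes(probe, target):
    """Perfect-match-only hybridization, modeled as substring identity."""
    return probe in target


# Hypothetical reference and a target carrying a G -> T change at index 10.
REFERENCE = "ACGTACGTACGTACGTACGTA"
MUTANT_TARGET = REFERENCE[:10] + "T" + REFERENCE[11:]

probe = aso_probe(REFERENCE, pos=10, alt_base="T", length=7)
binds_mutant = hybridizes(probe, MUTANT_TARGET)
binds_wild_type = hybridizes(probe, REFERENCE)
```

Under these assumptions the probe distinguishes the two alleles: it matches the mutant target and fails on the wild-type reference.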

Alternatively, allele specific amplification technology that depends on selective PCR 
amplification may be used in conjunction with the instant invention. Oligonucleotides used as 
primers for specific amplification may carry the mutation of interest in the center of the 
molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs, et al., 
1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer where, 
under appropriate conditions, mismatch can prevent or reduce polymerase extension (see, e.g., 
Prossner, 1993. Tibtech. 11: 238). In addition it may be desirable to introduce a novel 
restriction site in the region of the mutation to create cleavage-based detection. See, e.g., 
Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in certain embodiments 
amplification may also be performed using Taq ligase for amplification. See, e.g., Barany, 
1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases, ligation will occur only if there is a 
perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence of 
a known mutation at a specific site by looking for the presence or absence of amplification. 

The methods described herein may be performed, for example, by utilizing 
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent 
described herein, which may be conveniently used, e.g., in clinical settings to diagnose 
patients exhibiting symptoms or family history of a disease or illness involving a NOV-X 
gene. 

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in which 
NOV-X is expressed may be utilized in the prognostic assays described herein. However, any 
biological sample containing nucleated cells may be used, including, for example, buccal 
mucosal cells. 

Pharmacogenomics 

Agents, or modulators that have a stimulatory or inhibitory effect on NOV-X activity 
(e.g., NOV-X gene expression), as identified by a screening assay described herein can be 
administered to individuals to treat (prophylactically or therapeutically) disorders (e.g., 
disorders of olfactory loss, e.g., trauma, HIV illness, neoplastic growth, and neurological 
disorders, e.g., Parkinson's disease and Alzheimer's disease). In conjunction with such 
treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's 
genotype and that individual's response to a foreign compound or drug) of the individual may 
be considered. Differences in metabolism of therapeutics can lead to severe toxicity or 
therapeutic failure by altering the relation between dose and blood concentration of the 
pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the 
selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a 
consideration of the individual's genotype. Such pharmacogenomics can further be used to 
determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOV-X 
protein, expression of NOV-X nucleic acid, or mutation content of NOV-X genes in an 
individual can be determined to thereby select appropriate agent(s) for therapeutic or 
prophylactic treatment of the individual. 

Pharmacogenomics deals with clinically significant hereditary variations in the 
response to drugs due to altered drug disposition and abnormal action in affected persons. See 
e.g., Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol., 23: 983-985; Linder, 1997. Clin. 
Chem., 43: 254-266. In general, two types of pharmacogenetic conditions can be 
differentiated: genetic conditions transmitted as a single factor altering the way drugs act on 
the body (altered drug action) or genetic conditions transmitted as single factors altering the 
way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can 
occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate 
dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main 
clinical complication is hemolysis after ingestion of oxidant drugs (anti-malarials, 
sulfonamides, analgesics, nitrofurans) and consumption of fava beans. 

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major 
determinant of both the intensity and duration of drug action. The discovery of genetic 
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and 
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why 
some patients do not obtain the expected drug effects or show exaggerated drug response and 
serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are 
expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor 
metabolizer (PM). The prevalence of PM is different among different populations. For 
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been 
identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers of 
CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side 
effects when they receive standard doses. If a metabolite is the active therapeutic moiety, PM 
show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by 
its CYP2D6-formed metabolite morphine. At the other extreme are the so called ultra-rapid 
metabolizers who do not respond to standard doses. Recently, the molecular basis of 
ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification. 
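
The phenotype grouping in the paragraph above can be expressed as a small classifier. This is a rough sketch under stated assumptions: the text says only that poor metabolizers lack functional CYP2D6 and that ultra-rapid metabolism arises from gene amplification, so the specific cut-off of more than two functional gene copies is an illustrative assumption, and the function name is hypothetical.

```python
def metabolizer_phenotype(functional_copies):
    """Map a count of functional gene copies to a metabolizer phenotype.

    Assumed thresholds (illustrative only):
      0 copies          -> poor metabolizer (PM)
      1 or 2 copies     -> extensive metabolizer (EM)
      more than 2       -> ultra-rapid metabolizer (gene amplification)
    """
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return "PM"
    if functional_copies > 2:
        return "ultra-rapid"
    return "EM"


phenotypes = [metabolizer_phenotype(n) for n in (0, 1, 2, 3)]
```

Such a mapping is the kind of genotype-to-phenotype step that would precede any dosing or drug selection decision.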

Thus, the activity of NOV-X protein, expression of NOV-X nucleic acid, or mutation 
content of NOV-X genes in an individual can be determined to thereby select appropriate 
agent(s) for therapeutic or prophylactic treatment of the individual. In addition, 
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding 
drug-metabolizing enzymes to the identification of an individual's drug responsiveness 
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse 
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when 
treating a subject with a NOV-X modulator, such as a modulator identified by one of the 
exemplary screening assays described herein. 

Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or 
activity of NOV-X (e.g., the ability to modulate aberrant cell proliferation) can be applied not 
only in basic drug screening, but also in clinical trials. For example, the effectiveness of an 
agent determined by a screening assay as described herein to increase NOV-X gene 
expression, protein levels, or upregulate NOV-X activity, can be monitored in clinical trials of 
subjects exhibiting decreased NOV-X gene expression, protein levels, or downregulated 
NOV-X activity. Alternatively, the effectiveness of an agent determined by a screening assay 
to decrease NOV-X gene expression, protein levels, or downregulate NOV-X activity, can be 
monitored in clinical trials of subjects exhibiting increased NOV-X gene expression, protein 
levels, or upregulated NOV-X activity. In such clinical trials, the expression or activity of 
NOV-X and, preferably, other genes that have been implicated in, for example, a cellular 
proliferation or immune disorder can be used as a "read out" or markers of the immune 
responsiveness of a particular cell. 

By way of example, and not of limitation, genes, including NOV-X, that are modulated 
in cells by treatment with an agent (e.g., compound, drug or small molecule) that modulates 
NOV-X activity (e.g., identified in a screening assay as described herein) can be identified. 
Thus, to study the effect of agents on cellular proliferation disorders, for example, in a clinical 
trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of 
NOV-X and other genes implicated in the disorder. The levels of gene expression (i.e., a gene 
expression pattern) can be quantified by Northern blot analysis or RT-PCR, as described 
herein, or alternatively by measuring the amount of protein produced, by one of the methods 
as described herein, or by measuring the levels of activity of NOV-X or other genes. In this 
manner, the gene expression pattern can serve as a marker, indicative of the physiological 
response of the cells to the agent. Accordingly, this response state may be determined before, 
and at various points during, treatment of the individual with the agent. 

In one embodiment, the invention provides a method for monitoring the effectiveness 
of treatment of a subject with an agent (e.g., an agonist, antagonist, protein, peptide, 
peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by the 
screening assays described herein) comprising the steps of (i) obtaining a pre-administration 
sample from a subject prior to administration of the agent; (ii) detecting the level of expression 
of a NOV-X protein, mRNA, or genomic DNA in the pre-administration sample; (iii) obtaining 
one or more post-administration samples from the subject; (iv) detecting the level of 
expression or activity of the NOV-X protein, mRNA, or genomic DNA in the 
post-administration samples; (v) comparing the level of expression or activity of the NOV-X 
protein, mRNA, or genomic DNA in the pre-administration sample with the NOV-X protein, 
mRNA, or genomic DNA in the post-administration sample or samples; and (vi) altering the 
administration of the agent to the subject accordingly. For example, increased administration 
of the agent may be desirable to increase the expression or activity of NOV-X to higher levels 
than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased 
administration of the agent may be desirable to decrease expression or activity of NOV-X to 
lower levels than detected, i.e., to decrease the effectiveness of the agent. 
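
Steps (v) and (vi) of the monitoring method can be sketched as a simple comparison. The example below is illustrative only and assumes an agent that raises NOV-X expression, so that more agent moves the level up and less agent moves it down; the target level, the 10% tolerance band, the numeric readings, and the function name are all hypothetical.

```python
from statistics import mean


def adjust_administration(pre_level, post_levels, target_level, tolerance=0.10):
    """Compare pre- and post-administration levels and suggest a change.

    Assumes the agent increases NOV-X expression or activity.
    Returns the fold change versus the pre-administration sample and one
    of "increase", "decrease", or "maintain".
    """
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    post = mean(post_levels)
    fold_change = post / pre_level
    if post < target_level * (1 - tolerance):
        action = "increase"   # level still below target: raise administration
    elif post > target_level * (1 + tolerance):
        action = "decrease"   # level overshoots target: lower administration
    else:
        action = "maintain"
    return fold_change, action


# Hypothetical readings in arbitrary expression units.
fold, action = adjust_administration(10, [12, 14], target_level=20)
```

The returned action corresponds to step (vi), altering the administration of the agent according to the comparison made in step (v).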

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a 
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant 
NOV-X expression or activity. Disorders associated with aberrant NOV-X expression include, 
for example, disorders of olfactory loss, e.g., trauma, HIV illness, neoplastic growth, and 
neurological disorders, e.g., Parkinson's disease and Alzheimer's disease. 

These methods of treatment will be discussed more fully, below. 

Disease and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with 
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize 
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may 
be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs, 
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide; (iii) 
nucleic acids encoding an aforementioned peptide; (iv) administration of antisense nucleic 
acid and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion within the 
coding sequences of an aforementioned peptide) that are utilized to 
"knockout" endogenous function of an aforementioned peptide by homologous recombination 
(see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e., inhibitors, 
agonists and antagonists, including additional peptide mimetic of the invention or antibodies 
specific to a peptide of the invention) that alter the interaction between an aforementioned 
peptide and its binding partner. 

Diseases and disorders that are characterized by decreased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with 
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity 
may be administered in a therapeutic or prophylactic manner. Therapeutics that may be 
utilized include, but are not limited to, an aforementioned peptide, or analogs, derivatives, 
fragments or homologs thereof; or an agonist that increases bioavailability. 

Increased or decreased levels can be readily detected by quantifying peptide and/or 
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro for 
RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs of an 
aforementioned peptide). Methods that are well-known within the art include, but are not 
limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by 
sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, etc.) 
and/or hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot blots, in 
situ hybridization, and the like). 

Prophylactic Methods 

In one aspect, the invention provides a method for preventing, in a subject, a disease or 
condition associated with an aberrant NOV-X expression or activity, by administering to the 
subject an agent that modulates NOV-X expression or at least one NOV-X activity. Subjects 
at risk for a disease that is caused or contributed to by aberrant NOV-X expression or activity 
can be identified by, for example, any or a combination of diagnostic or prognostic assays as 
described herein. Administration of a prophylactic agent can occur prior to the manifestation 
of symptoms characteristic of the NOV-X aberrancy, such that a disease or disorder is 
prevented or, alternatively, delayed in its progression. Depending upon the type of NOV-X 
aberrancy, for example, a NOV-X agonist or NOV-X antagonist agent can be used for treating 
the subject. The appropriate agent can be determined based on screening assays described 
herein. The prophylactic methods of the invention are further discussed in the following 
subsections. 

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOV-X expression 
or activity for therapeutic purposes. The modulatory method of the invention involves 
contacting a cell with an agent that modulates one or more of the activities of NOV-X protein 
associated with the cell. An agent that modulates NOV-X protein activity can be an 
agent as described herein, such as a nucleic acid or a protein, a naturally-occurring cognate 
ligand of a NOV-X protein, a peptide, a NOV-X peptidomimetic, or other small molecule. In 
one embodiment, the agent stimulates one or more NOV-X protein activities. Examples of such 
stimulatory agents include active NOV-X protein and a nucleic acid molecule encoding NOV-X 
that has been introduced into the cell. In another embodiment, the agent inhibits one or 
more NOV-X protein activities. Examples of such inhibitory agents include antisense NOV-X 
nucleic acid molecules and anti-NOV-X antibodies. These modulatory methods can be 
performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by 
administering the agent to a subject). As such, the invention provides methods of treating an 
individual afflicted with a disease or disorder characterized by aberrant expression or activity 
of a NOV-X protein or nucleic acid molecule. In one embodiment, the method involves 
administering an agent (e.g., an agent identified by a screening assay described herein), or 
combination of agents that modulates (e.g., up-regulates or down-regulates) NOV-X 
expression or activity. In another embodiment, the method involves administering a NOV-X 
protein or nucleic acid molecule as therapy to compensate for reduced or aberrant NOV-X 
expression or activity. 

Stimulation of NOV-X activity is desirable in situations in which NOV-X is 
abnormally downregulated and/or in which increased NOV-X activity is likely to have a 
beneficial effect. One example of such a situation is where a subject has a disorder 
characterized by aberrant cell proliferation and/or differentiation (e.g., cancer or immune 
associated disorders). Another example of such a situation is where the subject has an 
immunodeficiency disease (e.g., AIDS). 

Antibodies of the invention, including polyclonal, monoclonal, humanized and fully 
human antibodies, may be used as therapeutic agents. Such agents will generally be employed to 
treat or prevent a disease or pathology in a subject. An antibody preparation, preferably one 
having high specificity and high affinity for its target antigen, is administered to the subject 
and will generally have an effect due to its binding with the target. Such an effect may be one 
of two kinds, depending on the specific nature of the interaction between the given antibody 
molecule and the target antigen in question. In the first instance, administration of the 
antibody may abrogate or inhibit the binding of the target with an endogenous ligand to which 
it naturally binds. In this case, the antibody binds to the target and masks a binding site of the 
naturally occurring ligand, wherein the ligand serves as an effector molecule. Thus the 
receptor mediates a signal transduction pathway for which ligand is responsible. 

Alternatively, the effect may be one in which the antibody elicits a physiological result 
by virtue of binding to an effector binding site on the target molecule. In this case the target, a 
receptor having an endogenous ligand which may be absent or defective in the disease or 
pathology, binds the antibody as a surrogate effector ligand, initiating a receptor-based signal 
transduction event by the receptor. 

A therapeutically effective amount of an antibody of the invention relates generally to 
the amount needed to achieve a therapeutic objective. As noted above, this may be a binding 
interaction between the antibody and its target antigen that, in certain cases, interferes with the 
functioning of the target, and in other cases, promotes a physiological response. The amount 
required to be administered will furthermore depend on the binding affinity of the antibody for 
its specific antigen, and will also depend on the rate at which an administered antibody is 
depleted from the free volume of the subject to which it is administered. Common ranges for 
therapeutically effective dosing of an antibody or antibody fragment of the invention may be, 
by way of nonlimiting example, from about 0.1 mg/kg body weight to about 50 mg/kg body 
weight. Common dosing frequencies may range, for example, from twice daily to once a 
week. 

Determination of the Biological Effect of the Therapeutic 

In various embodiments of the invention, suitable in vitro or in vivo assays are 
performed to determine the effect of a specific Therapeutic and whether its administration is 
indicated for treatment of the affected tissue. 

In various specific embodiments, in vitro assays may be performed with representative 
cells of the type(s) involved in the patient's disorder, to determine if a given Therapeutic exerts 
the desired effect upon the cell type(s). Compounds for use in therapy may be tested in 
suitable animal model systems including, but not limited to rats, mice, chicken, cows, 
monkeys, rabbits, and the like, prior to testing in human subjects. Similarly, for in vivo 
testing, any of the animal model systems known in the art may be used prior to administration 
to human subjects. 

The invention will be further described in the following examples, which do not limit 
the scope of the invention described in the claims. 

Examples 

Example 1: Quantitative Expression Analysis of NOV-1, NOV-2, NOV-3, and NOV-4 in 
various cells and tissues. 

RTQ-PCR Panel Descriptions: 
Panel 1: 

As shown in the expression data in Tables 39, 40, and 41, Panel 1 of each table is 
composed of RNA or cDNA isolated from various human cells or cell lines from normal and 
cancerous tissue. These cells and cell lines have been extensively characterized by 
investigators in both academia and the commercial sector regarding their tumorigenicity, 
metastatic potential, drug resistance, invasive potential, and other cancer-related properties. 
They serve as suitable tools for pre-clinical evaluation of anti-cancer agents and promising 
therapeutic strategies. 

Panel 2: 

In Tables 39, 40, and 41, Panel 2 of each table includes 2 control wells and 94 test 
samples composed of RNA or cDNA isolated from human tissue procured by surgeons 
working in close cooperation with the National Cancer Institute's Cooperative Human Tissue 
Network (CHTN) or the National Disease Research Interchange (NDRI). The tissues are derived 
from human malignancies and, in cases where indicated, many malignant tissues have 
"matched margins", which are non-cancerous tissues adjacent to the tumor. These are termed 
normal adjacent tissues and are denoted "NAT" in Tables 39, 40, and 41. The tumor tissue 
and the matched margins are evaluated by two independent pathologists at NDRI or CHTN. 
This analysis provides a gross histopathological assessment of tumor differentiation grade. 
Moreover, most samples include the original surgical pathology report that provides 
information regarding the clinical stage of the patient. In addition, these RNA and cDNA 
samples were obtained from various human tissues derived from autopsies performed on 
elderly people or sudden death victims (accidents, etc.). These tissues were ascertained to be 
free of disease and were purchased from various commercial sources such as Clontech (Palo 
Alto, CA), Research Genetics, and Invitrogen. 

RNA integrity from all samples is controlled for quality by visual assessment of 
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a 
guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would be 
indicative of degradation products. Samples are controlled against genomic DNA 
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using probe 
and primer sets designed to amplify across the span of a single exon. 

Panel 3: 

Panel 3 in Tables 39, 40, and 41 includes samples on a 96 well plate (2 control wells, 
94 test samples) composed of RNA or cDNA isolated from various human cell lines or tissues 
related to inflammatory conditions. Total RNA from control normal tissues such as colon and 
lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was employed. Total 
RNA from liver tissue from cirrhosis patients and kidney from lupus patients was obtained 
from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal tissue for RNA preparation 
from patients diagnosed as having Crohn's disease and ulcerative colitis was obtained from the 
National Disease Research Interchange (NDRI) (Philadelphia, PA). 

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells, 
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells, 
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, and human 
umbilical vein endothelial cells were all purchased from Clonetics (Walkersville, MD) and 
grown in the media supplied for these cell types by Clonetics. These primary cell types were 
activated with various cytokines or combinations of cytokines for 6 and/or 12-14 hours, as 
indicated. The following cytokines were used: IL-1 beta at approximately 1-5 ng/ml, TNF 
alpha at approximately 5-10 ng/ml, IFN gamma at approximately 20-50 ng/ml, IL-4 at 
approximately 5-10 ng/ml, IL-9 at approximately 5-10 ng/ml, and IL-13 at approximately 5-10 
ng/ml. Endothelial cells were sometimes starved for various times by culture in the basal 
media from Clonetics with 0.1% serum. 

Mononuclear cells were prepared from blood of employees at CuraGen Corporation, 
using Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS 
(Hyclone), 100 µM non essential amino acids (Gibco/Life Technologies, Rockville, MD), 1 
mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes 
(Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with 10-20 ng/ml 
PMA and 1-2 µg/ml ionomycin, IL-12 at 5-10 ng/ml, IFN gamma at 20-50 ng/ml and IL-18 at 
5-10 ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5 days in 
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) with 
PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5 µg/ml. Samples 
were taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte reaction) 
samples were obtained by taking blood from two donors, isolating the mononuclear cells using 
Ficoll and mixing the isolated mononuclear cells 1:1 at a final concentration of approximately 
2x10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 
mM sodium pyruvate (Gibco), mercaptoethanol (5.5 x 10^-5 M) (Gibco), and 10 mM Hepes 
(Gibco). The MLR was cultured and samples taken at various time points ranging from 1-7 
days for RNA preparation. 

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, positive VS 
selection columns and a Vario Magnet according to the manufacturer's instructions. 
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum 
(FCS) (Hyclone, Logan, UT), 100 µM non essential amino acids (Gibco), 1 mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco), 50 ng/ml 
GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of monocytes 
for 5-7 days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM 
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and 
10% AB Human Serum or MCSF at approximately 50 ng/ml. Monocytes, macrophages and 
dendritic cells were stimulated for 6 and 12-14 hours with lipopolysaccharide (LPS) at 100 
ng/ml. Dendritic cells were also stimulated with anti-CD40 monoclonal antibody 
(Pharmingen) at 10 ng/ml for 6 and 12-14 hours. 

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from 
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns 
and a Vario Magnet according to the manufacturer's instructions. CD45RA and CD45RO CD4 
lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, CD14 and CD19 
cells using CD8, CD56, CD14 and CD19 Miltenyi beads and positive selection. Then CD45RO 
beads were used to isolate the CD45RO CD4 lymphocytes with the remaining cells being 
CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and CD8 lymphocytes were 
placed in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM 
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) 
and plated at 10^ cells/ml onto Falcon 6 well tissue culture plates that had been coated 
overnight with 0.5 µg/ml anti-CD28 (Pharmingen) and 3 µg/ml anti-CD3 (OKT3, ATCC) in 
PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To prepare 
chronically activated CD8 lymphocytes, we activated the isolated CD8 lymphocytes for 4 days 
on anti-CD28 and anti-CD3 coated plates and then harvested the cells and expanded them in 
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) and IL-2. 
The expanded CD8 cells were then activated again with plate bound anti-CD3 and anti-CD28 
for 4 days and expanded as before. RNA was isolated 6 and 24 hours after the second 
activation and after 4 days of the second expansion culture. The isolated NK cells were 
cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM 
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco) 
and IL-2 for 4-6 days before RNA was prepared. 

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with sterile 
dissecting scissors and then passed through a sieve. Tonsil cells were then spun down and 
resuspended at 10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids 
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM 
Hepes (Gibco). To activate the cells, we used PWM at 5 µg/ml or anti-CD40 (Pharmingen) at 
approximately 10 µg/ml and IL-4 at 5-10 ng/ml. Cells were harvested for RNA preparation at 
24, 48 and 72 hours. 

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates 
were coated overnight with 10 µg/ml anti-CD28 (Pharmingen) and 2 µg/ml OKT3 (ATCC), 
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems, 
Germantown, MD) were cultured at 10^5-10^6 cells/ml in DMEM 5% FCS (Hyclone), 100 µM 
non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 
M (Gibco), 10 mM Hepes (Gibco) and IL-2 (4 ng/ml). IL-12 (5 ng/ml) and anti-IL4 (1 
µg/ml) were used to direct to Th1, while IL-4 (5 ng/ml) and anti-IFN gamma (1 µg/ml) were 
used to direct to Th2 and IL-10 at 5 ng/ml was used to direct to Tr1. After 4-5 days, the 
activated Th1, Th2 and Tr1 lymphocytes were washed once in DMEM and expanded for 4-7 
days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM Hepes (Gibco) and IL-2 (1 
ng/ml). Following this, the activated Th1, Th2 and Tr1 lymphocytes were re-stimulated for 5 
days with anti-CD28/OKT3 and cytokines as described above, but with the addition of 
anti-CD95L (1 µg/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1 lymphocytes 
were washed and then expanded again with IL-2 for 4-7 days. Activated Th1 and Th2 
lymphocytes were maintained in this way for a maximum of three cycles. RNA was prepared 
from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following the second and 
third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into the second 
and third expansion cultures in Interleukin 2. 

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1, 
KU-812. EOL cells were further differentiated by culture in 0.1 mM dbcAMP at 5 x 10^ 
cells/ml for 8 days, changing the media every 3 days and adjusting the cell concentration to 5 
x 10^ cells/ml. For the culture of these cells, we used DMEM or RPMI (as recommended by 
the ATCC), with the addition of 5% FCS (Hyclone), 100 µM non essential amino acids 
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM 
Hepes (Gibco). RNA was either prepared from resting cells or cells activated with PMA at 10 
ng/ml and ionomycin at 1 µg/ml for 6 and 14 hours. Keratinocyte line CCD1106 and an airway 
epithelial tumor line NCI-H292 were also obtained from the ATCC. Both were cultured in 
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco). 
CCD1106 cells were activated for 6 and 14 hours with approximately 5 ng/ml TNF alpha and 
1 ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours with the following 
cytokines: 5 ng/ml IL-4, 5 ng/ml IL-9, 5 ng/ml IL-13 and 25 ng/ml IFN gamma. 

For these cell lines and blood cells, RNA was prepared by lysing approximately 10^ 
cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane (Molecular 
Research Corporation) was added to the RNA sample, vortexed and after 10 minutes at room 
temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase 
was removed and placed in a 15 ml Falcon Tube. An equal volume of isopropanol was added 
and left at -20 degrees C overnight. The precipitated RNA was spun down at 9,000 rpm for 
15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was redissolved in 300 
µl of RNAse-free water and 35 µl buffer (Promega), 5 µl DTT, 7 µl RNAsin and 8 µl DNAse 
were added. The tube was incubated at 37 degrees C for 30 minutes to remove contaminating 
genomic DNA, extracted once with phenol chloroform and re-precipitated with 1/10 volume 
of 3 M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun down and placed 
in RNAse free water. RNA was stored at -80 degrees C. 

Methods: 

The quantitative expression of various clones was assessed using microtiter plates 
containing RNA samples from a variety of normal and pathology-derived cells, cell lines and 
tissues using real time quantitative PCR (RTQ PCR; TAQMAN®). RTQ PCR was performed 
on a Perkin-Elmer Biosystems ABI PRISM® 7700 Sequence Detection System. Various 
collections of samples are assembled on the plates, and referred to as Panel 1 (containing cells 
and cell lines from normal and cancer sources), Panel 2 (containing samples derived from 
tissues, in particular from surgical samples, from normal and cancer sources), Panel 3 
(containing samples derived from a wide variety of cancer sources) and Panel 3 (containing 
cells and cell lines from normal cells and cells related to inflammatory conditions). 

First, the RNA samples were normalized to constitutively expressed genes such as 
β-actin and GAPDH. RNA (~50 ng total or ~1 ng polyA+) was converted to cDNA using the 
TAQMAN® Reverse Transcription Reagents Kit (PE Biosystems, Foster City, CA; Catalog 
No. N808-0234) and random hexamers according to the manufacturer's protocol. Reactions 
were performed in 20 µl and incubated for 30 min. at 48°C. cDNA (5 µl) was then transferred 
to a separate plate for the TAQMAN® reaction using β-actin and GAPDH TAQMAN® 
Assay Reagents (PE Biosystems; Catalog Nos. 4310881E and 4310884E, respectively) and 
TAQMAN® universal PCR Master Mix (PE Biosystems; Catalog No. 4304447) according to 
the manufacturer's protocol. Reactions were performed in 25 µl using the following 
parameters: 2 min. at 50°C; 10 min. at 95°C; 15 sec. at 95°C/1 min. at 60°C (40 cycles). 
Results were recorded as CT values (cycle at which a given sample crosses a threshold level of 
fluorescence) using a log scale, with the difference in RNA concentration between a given 
sample and the sample with the lowest CT value being represented as 2 to the power of delta 
CT. The percent relative expression is then obtained by taking the reciprocal of this RNA 
difference and multiplying by 100. The average CT values obtained for β-actin and GAPDH 
were used to normalize RNA samples. The RNA sample generating the highest CT value 
required no further diluting, while all other samples were diluted relative to this sample 
according to their β-actin/GAPDH average CT values. 
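
The relative expression calculation described above can be sketched as follows. This is a minimal illustration only, not the software used in the study; the function name and the sample CT values are hypothetical, and delta CT is taken as each sample's CT minus the lowest CT in the set, as the text describes.

```python
def percent_relative_expression(ct_values):
    """Convert CT values to percent relative expression.

    ct_values: dict mapping a sample name to its CT value.
    The sample with the lowest CT is assigned 100%.
    """
    if not ct_values:
        raise ValueError("ct_values must not be empty")
    lowest_ct = min(ct_values.values())
    result = {}
    for sample, ct in ct_values.items():
        delta_ct = ct - lowest_ct              # difference from the lowest CT
        rna_difference = 2.0 ** delta_ct       # 2 to the power of delta CT
        result[sample] = 100.0 / rna_difference  # reciprocal, multiplied by 100
    return result


# Hypothetical CT values, chosen only to make the arithmetic easy to follow.
example = percent_relative_expression(
    {"sample_A": 25.0, "sample_B": 27.0, "sample_C": 30.0}
)
```

With these illustrative inputs, a sample two cycles behind the lowest CT comes out at one quarter of its expression level, and a sample five cycles behind at one thirty-second.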

Normalized RNA (5 µl) was converted to cDNA and analyzed via TAQMAN® using 
One Step RT-PCR Master Mix Reagents (PE Biosystems; Catalog No. 4309169) and 
gene-specific primers according to the manufacturer's instructions. Probes and primers were 
designed for each assay according to Perkin Elmer Biosystem's Primer Express Software 
package (version I for Apple Computer's Macintosh Power PC) or a similar algorithm using 
the target sequence as input. Default settings were used for reaction conditions and the 
following parameters were set before selecting primers: primer concentration = 250 nM, 
primer melting temperature (Tm) range = 58°-60° C, primer optimal Tm = 59° C, maximum 
primer difference = 2° C, probe does not have 5' G, probe Tm must be 10° C greater than 
primer Tm, amplicon size 75 bp to 100 bp. The probes and primers selected (see below) were 
synthesized by Synthegen (Houston, TX, USA). Probes were double purified by HPLC to 
remove uncoupled dye and evaluated by mass spectroscopy to verify coupling of reporter and 
quencher dyes to the 5' and 3' ends of the probe, respectively. Their final concentrations 
were: forward and reverse primers, 900 nM each, and probe, 200 nM. 
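
The selection parameters listed above can be expressed as a simple rule check. The sketch below is illustrative only and does not reproduce the Primer Express algorithm; the `OligoSet` type, the `check_design` function, and the example values are hypothetical, and the probe rule is interpreted here as "at least 10° C above the higher of the two primer Tm values", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OligoSet:
    forward_tm: float      # forward primer Tm in degrees C
    reverse_tm: float      # reverse primer Tm in degrees C
    probe_tm: float        # probe Tm in degrees C
    probe_sequence: str    # probe sequence written 5' to 3'
    amplicon_length: int   # amplicon size in bp


def check_design(oligos):
    """Return a list of rule violations; an empty list means all rules pass."""
    problems = []
    for name, tm in (("forward", oligos.forward_tm), ("reverse", oligos.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{name} primer Tm outside 58-60 C")
    if abs(oligos.forward_tm - oligos.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    if oligos.probe_sequence.upper().startswith("G"):
        problems.append("probe has a 5' G")
    # Assumed interpretation: compare against the higher primer Tm.
    if oligos.probe_tm - max(oligos.forward_tm, oligos.reverse_tm) < 10.0:
        problems.append("probe Tm less than 10 C above primer Tm")
    if not 75 <= oligos.amplicon_length <= 100:
        problems.append("amplicon size outside 75-100 bp")
    return problems


# Hypothetical designs used only to exercise the rules.
passing = check_design(OligoSet(59.0, 59.5, 69.5, "TCAGGACCTA", 80))
failing = check_design(OligoSet(57.0, 60.0, 65.0, "GATTACA", 120))
```

Returning a list of violations rather than a single pass/fail flag makes it easier to see which parameter a candidate design misses.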

The Taqman oligonucleotide set Ag756 for NOV-1, NOV-2, and NOV-2b (i.e., 
10132038) includes the forward, probe, and reverse oligomers shown below: 

TABLE 36 

Primers   Sequences                                                      TM     Length   Start Position 
Forward   5'-GGAGCAGTTCCTCACTTATCG-3' (SEQ ID NO: 47)                    59     21       248 
Probe     TET-5'-TGATGACCAGACCTCAAGAAACACTCG-3'-TAMRA (SEQ ID NO: 48)    68.6   27       272 
Reverse   5'-CAGTTGCCATCTTTGTCTTCAT-3' (SEQ ID NO: 49)                   59.2   22       304 


The Taqman oligonucleotide set Ag756 for NOV-3a through NOV-3d (i.e., 18552586) 
includes the forward, probe, and reverse oligomers shown below: 

TABLE 37 

Primers   Sequences                                                      TM     Length   Start Position 
Forward   5'-AATGCTGAGGTCAAGCTAGGT-3' (SEQ ID NO: 50)                    58.1   21       121 
Probe     TET-5'-CTCCTTCTGAGGCTGACGAGGACCT-3'-TAMRA (SEQ ID NO: 51)      69.3   25       149 
Reverse   5'-CATTCTCTGTTCTGGAGGTGAA-3' (SEQ ID NO: 52)                   59.3   22       174 



The Taqman oligonucleotide set Ag756 for NOV-4a, NOV-4b, NOV-4c, NOV-4d, and 
NOV-4e (i.e., 10093S72) includes the forward, probe, and reverse oligomers shown below: 

TABLE 38 

Primer    Sequences                                                      Length 
Forward   5'-GGACTCCTCGGGATGGAAAG-3' (SEQ ID NO: 53)                     20 
Probe     FAM-5'-CGGCCTTGGTCTCGGAGATCCC-3'-TAMRA (SEQ ID NO: 54)         23 
Reverse   5'-CTCCCCTGGTGCTGGAAATT-3' (SEQ ID NO: 55)                     20 




PCR conditions: 

Normalized RNA from each tissue and each cell line was spotted in each well of a 96 
well PCR plate (Perkin Elmer Biosystems). PCR cocktails including two probes (a probe 
specific for the target clone and another gene-specific probe multiplexed with the target probe) 
were set up using 1X TaqMan™ PCR Master Mix for the PE Biosystems 7700, with 5 mM 
MgCl2, dNTPs (dA, G, C, U at 1:1:1:2 ratios), 0.25 U/ml AmpliTaq Gold™ (PE Biosystems), 
and 0.4 U/µl RNase inhibitor, and 0.25 U/µl reverse transcriptase. Reverse transcription was 
performed at 48° C for 30 minutes followed by amplification/PCR cycles as follows: 95° C 10 
min, then 40 cycles of 95° C for 15 seconds, 60° C for 1 minute. 
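
As a rough planning aid, the programmed hold times in the protocol above can be totalled. This sketch is illustrative only; the function name is hypothetical, and it deliberately ignores temperature ramp times and instrument overhead, so actual run times will be longer.

```python
def protocol_minutes(rt_minutes=30.0, activation_minutes=10.0, cycles=40,
                     denature_seconds=15.0, anneal_extend_seconds=60.0):
    """Sum the programmed hold times of the one-tube RT-PCR protocol.

    Defaults follow the text: 48 C for 30 min, 95 C for 10 min, then
    40 cycles of 95 C for 15 s and 60 C for 1 min. Ramp times are ignored.
    """
    cycling_minutes = cycles * (denature_seconds + anneal_extend_seconds) / 60.0
    return rt_minutes + activation_minutes + cycling_minutes


total = protocol_minutes()
pcr_only = protocol_minutes(rt_minutes=0.0)
```

With the stated parameters the programmed holds add up to 90 minutes, of which 50 minutes are the 40 amplification cycles.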



TABLE 39: NOV-1, NOV-2, NOV-2b Taqman Results 

In panel 1 of the results, the following abbreviations are used: 
ca. = carcinoma, 
* = established from metastasis, 
met = metastasis, 
s cell var = small cell variant, 
non-s = non-sm = non-small, 
squam = squamous, 
pl. eff = pl effusion = pleural effusion, 
glio = glioma, 
astro = astrocytoma, and 
neuro = neuroblastoma. 



In panel 2 of the results, the following abbreviations are used: 
CCa: Colon Cancer 
PCa: Prostate Cancer 
LCa: Lung Cancer 
RCC: Renal Cell Carcinoma 
UtCa: Uterine Cancer 
ThyCa: Thyroid Cancer 
BrCa: Breast Cancer 
HCC: Hepatic Cell Carcinoma 
TCC: Transitional Cell Carcinoma of the bladder 
OvCa: Ovarian Cancer 
GaCa: Gastric Cancer 





Panel 1 














Run 1 














Run 2 




Panel 2 




Panel 3 




Tissue_Name 


ag756 
%Rel. 
Expn. 


g756 

% Rel. 
Expn. 


Tissue_Name 


ag75 

6% 

Rel. 

Exp 

n. 


Tissue_Name 


ag75 
6 

% 

Rel. 
Expn 


Endothelial 
cells 


0.0 


0.0 


Normal Colon 


78.5 


93768 Secondary 

Th1 anti-CD28/anti-CD3 


0 


Endothelial 
cells (treated) 


12.2 


54.7 


CCa 1 


1.0 


93769_Secondary 

Th2 anti-CD28/anti-CD3 


0 


Pancreas 


27.6 


5.4 


CCa1 
Margin 


7.9 


93770_Secondary 

Tr1 anti-CD28/anti-CD3 


0 


Pancreatic 
ca.CAPAN 2 


0.0 


0.0 


CCa 2 


3.7 


93573_Secondary 
Th1_resting day 4-6 in IL-2 


0 


Adrenal Gland 
(new lot*) 


9.3 


29.3 


CCa 2 
Margin 


15.2 


93572_Secondary 
Th2_resting day 4-6 in IL-2 


0 


Thyroid 


8.0 


6.5 


CCa 3 


0.4 


93571_Secondary 
Tr1_resting day 4-6 in IL-2 


0 
Salivary gland 


6.8 


19.9 


CCa3 
Margin 


35.6 


93568_primary Th1 anti- 
CD28/anti-CD3 


0 


Pituitary gland 


3.2 


7.8 


CCa4 


10.1 


93569_primary Th2 anti- 
CD28/anti-CD3 


0 


Brain (fetal) 


3.4 


18.4 


CCa4 
Margin 


11.6 


93570_primary Tr1 anti- 
CD28/anti-CD3 


0 


Brain (whole) 


6.9 


27.4 


CCa5 
Metastasis 


7.2 


93565_primary Th1_resting 
dy 4-6 in IL-2 


0 


Brain 


2.5 


13.8 


CCa5 

Margin 
(Liver) 


52.9 


93566_primary Th2_resting 
dy 4-6 in IL-2 


0 


Brain 

(cerebellum) 


2.0 


28.7 


CCa6 


2.5 


93567_primary Tr1_resting 
dy 4-6 in IL-2 


0 


Brain 

(hippocampus) 


3.8 


20.9 


CCa 6 

Margin 
(Lung) 


14.1 


93351_CD45RA CD4 
lymphocyte anti- 
CD28/anti-CD3 


0 


Brain 

(thalamus) 


3.0 


11.0 


Normal 
Prostate 


10.0 


93352_CD45RO CD4 
lymphocyte anti- 
CD28/anti-CD3 


0 


Cerebral 
Cortex 


7.0 


61.1 


PCa 1 


10.7 


93251_CD8 
Lymphocytes anti- 
CD28/anti-CD3 


0 


Spinal cord 


8.6 


27.0 


PCa 1 Margin 


37.6 


93353_chronic CD8 
Lymphocytes 2ry resting 
dy 4-6 in IL-2 


0 


CNS 

ca.(glio/astro) 
U87-MG 


0.0 


0.0 


PCa 2 


100.0 


93574 chronic CD8 
Lymphocytes 2ry activated 
CD3/CD28 


0 


CNS 

ca.(glio/astro)U 
-HS-MG 


0.2 


0.0 


PCa 2 Margin 


89.5 


93354 CD4 none 


0 


ca.(astro)SW1 
783 


0.0 


0.0 


Normal Lung 


51.1 


93252 Secondary 

Th 1 /Th2/Tr1 _anti-CD95 

CH11 


0 


CNS ca.* 

^neuro* met 
)SK-N-AS 


0.0 


0.0 


LCa 1 
Metastasis 


1.0 


93103_LAK cells_resting 


0 


CNS ca 
(astro)SF-539 


0.1 


0.0 


LCa 1 Margin 
(Muscle) 


11.3 


93788 LAK cells IL-2 


0 


CNS ca 
(astro)SNB-75 


0.3 


0.3 


LCa 2 


8.5 


93787 LAK cells IL-2+IL- 
12 


0 


CNS ca. 
(glio)SNB-19 


0.1 


0.0 


LCa 2 Margin 


31.6 


93789_LAK cells_IL-2+IFN 
gamma 


0 


CNS ca. 
(glio)U251 


0.0 


0.0 


LCa 3 


5.8 


93790 LAK cells IL-2+ IL- 
18 


0 


CNS ca. 
(glio)SF-295 


0.0 


0.0 


LCa 3 Margin 


28.3 


93104_LAK 

cells PMA/ionomycin and 
IL-18 


0 


Heart 


28.5 


77.9 


LCa 4 


1.6 


93578_NK Cells IL- 
2_resting 


0 


Skeletal 
Muscle (new 
lot*) 


16.3 


15.7 


LCa 5 


4.2 


93109_Mixed Lymphocyte 
Reaction_Two Way MLR 


0 


Bone marrow 


0.7 


0.9 


LCa 5 Margin 


29.5 


93110_Mixed Lymphocyte 
Reaction_Two Way MLR 


0 


Thymus 


1.1 


2.7 


Ocular 

Melanoma 

Metastasis 


15.9 


93111_Mixed Lymphocyte 
Reaction_Two Way MLR 


0 


Spleen 


0.9 


2.1 


Ocular 

Melanoma 

Margin 


38.7 


93112_Mononuclear Cells 
(PBMCs)_resting 


0 
(Liver) 








Lymph node 


3.6 


10.2 


Melanoma 
Metastasis 


0.0 


93113 Mononuclear Cells 
(PBMCs)_PWM 


0 


Colorectal 


2.2 


11.4 


Melanoma 

Margin 

(Lung) 


32.3 


93114_Mononuclear Cells 
(PBMCs)_PHA-L 


0 


Stomach 


11.8 


34.4 


Normal 
Kidney 


56.3 


93249 Ramos (B 
cell)_none 


0 


Small Intestine 


11.7 


18.7 


RCC 1 


71.2 


93250_Ramos (B 
cell)_ionomycin 


0 


Colon 
ca.SW480 


0.0 


0.0 


RCC1 
Margin 


26.1 


93349_B 

lymphocytes_PWM 


0 


Colon ca.* 

(SW480 

met)SW620 


0.0 


0.0 


RCC 2 


63.7 


93350 B 

lymphoytes CD40L and IL- 
4 


0 


Colon ca.HT29 


0.0 


0.0 


RCC 2 
Margin 


28.3 


92665 EOL-1 

(Eosinophil)_dbcAMP 

differentiated 




Colon ca.HCT- 
116 


0.0 


0.0 


RCC 3 


37.1 


93248_EOL-1 

(Eosinophil)_dbcAMP/PmA 

ionomycin 


0 


Colon 
ca.CaCo-2 


0.0 


0.0 


RCC 3 
Margin 


33.7 


93356_Dendritic 
Cells none 


0 


83219 CC Well 
to Mod Diff 
(OD03866) 


0.3 


1.7 


RCC 4 


5.7 


93355_Dendritic Cells^LPS 
100 ng/ml 


0 


Colon ca.HCC- 
2998 


0.0 


0.0 


RCC 4 
Margin 


18.2 


93775_Dendritic 
Cells anti-CD40 


0 


Gastric ca.* 
(liver met) NCI- 
N87 


0.0 


0.0 


RCC 5 


8.5 


93774_Monocytes_resting 


0 


Bladder 


16.4 


29.3 


RCC 5 
Margin 


5.5 


93776_Monocytes_LPS 50 
ng/ml 


0 


Trachea 


4.2 


12.7 


RCC 6 


1.0 


93581_Macrophages_resting 


0 


Kidney 


5.3 


14.3 


RCC 6 
Margin 


18.7 


93582_Macrophages_LPS 

100 ng/ml 


0 


Kidney (fetal) 


7.8 


24.3 


RCC 7 


6.0 


93098 HUVEC 
(Endothelial)_none 


0 


Renal ca. 786- 
0 


0.5 


2.4 


RCC 7 
Margin 


8.5 


93099 HUVEC 
(Endothelial) starved 


0 


Renal ca.A498 


0.2 


0.0 


RCC 8 


0.3 


93100_HUVEC 
(Endothelial)_IL-1b 




Renal ca.RXF 
393 


6.0 


18.6 


RCC 8 
Margin 


14.1 


93779_HUVEC 
(Endothelial)_IFN gamma 


0.1 


Renal 
ca.ACHN 


15.0 


28.7 


RCC 9 


6.3 


93102 HUVEC 
(Endothelial)_TNF alpha + 
IFN oamma 


0 0 

w. w 


Renal ca.UO- 
31 


1.2 


4.2 


RCC 9 
Margin 


19.6 


93101_HUVEC 
(Endothelial) TNP aloha + 
IL4 


0.0 


Renal ca.TK- 
10 


10.3 


21.2 


Normal 
Uterus 


7.0 


93781_HUVEC 
fEndotheliaH IL-11 


0.0 


Liver 


12.9 


48.0 


UtCa 1 


46.0 


93583__Lung Microvascular 
Endothelial Cells none 


37.1 


Liver (fetal) 


4.7 


17.7 


Normal 
Thyroid 


6.1 


93584_Lung Microvascular 
Endothelial Cells_TNFa (4 
ng/ml) and ILIb (1 ng/ml) 


12.9 


Liver ca, 
(hepatobtast) 


0.0 


0.0 


ThyCa 1 


6.1 


92662_MicrovascuIar 
Dermal endothelium none 


81.2 
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HepG2 














1 1 inn 


7 4 


25.4 


ThyCa 2 


0.8 


92663_Microsvasular 
Dermal endothelium TNFa 
(4 ng/ml) and IL1b(1 
ng/ml) 


38.7 


Luna ^fetaH 


9.6 


19.0 


ThyCa 2 
Margin 


22.5 


93773_Bronchlal 
epithelium TNFa (4 ng/ml) 
and ILIb (1 ng/mr)** 


0.0 


Lung ca. (small 
cein LX-1 


0.0 


0.0 


Normal 
Breast 


12.1 


93347_Small Airway 
Epithelium_none 


0.0 


Lung ca. (small 

wdly iNV-zi ri\Ji7 


0 0 


0.0 


BrCa 1 


7.5 


93348_Small Airway 
Epithelium TNFa (4 ng/ml) 
and ILIb (1 ng/ml) 


0.5 


Lung ca. (s.cel! 
var ^ SHP-77 


0.0 


0.0 


BrCa 2 


4.0 


92668_Coronery Artery 
SMC resting 


0.6 


1 linn ca /l9rnf> 

celI)NCI-H460 


0.1 


0.0 


BrCa 3 
Metastasis 


15.1 


92669 Coronery Artery 
SMC TNFa (4 ng/ml) and 
ILIbd ng/ml) 


0.1 


1 linn r*si /nnn- 

sm. cell) A549 


0.2 


0.0 


BrCa 4 
Metastasis 


18.4 


931 07__astrocytes_resting 


29.0 


1 linn f*st ( nnn- 

s.cell) NCI-H23 


0.4 


2.4 


BrCa 5 


11.7 


93108 astrocytes TNFa (4 
ng/ml) and ILIb (1 ng/ml) 


17.0 


Lung ca (non- 


1 3 


0.8 


BrCa 6 


3.1 


92666_KU-812 
(Basophil) resting 


0.0 


Lung ca. (non- 
s.cl) NCI-H522 


100.0 


100.0 


BrCa 6 
Margin 


5.3 


92667_KU-812 
(Basophil)_PMA/ionoyc!n 


0.0 


1 1 inn 

Luny 

(squam.) SW 
Qno 


0 7 


0.8 


BrCa 7 


6.8 


93579_CCD1106 
(Keratinocytes) none 


0.0 


Lung ca. 
(squam.) NCI- 


0 0 


0.0 


BrCa 7 
Maroin 


11.7 


93580_CCD1106 
(Keratinocytes) TNFa and 
IFNa ** 


4.1 


Mammary 
gland 


5.0 


7.5 


Normal Liver 


37.1 


93791 Liver Cirrhosis 


8.1 


effusion) 

MnF-7 

IVIwi 1 


2 7 


12.0 


HCC 1 


47.0 


93792 Lupus Kidney 


18.1 


Breast ca.* 
MB-231 


0.0 


0.0 


HCC2 


34.2 


93577 NCI-H292 


1.9 


effusion) T47D 


0.2 


0.0 


HCC 3 


5.2 


93358 NCI-H292 IL-4 


4.6 


Di cool Oa.D 1 

549 


0.0 


0.0 


HCC 4 


27.6 


93360 NCI-H292 IL-9 


0.9 


MDA-N 


0.0 


0.0 


HCC 4 
Margin 


3.6 


93359 NCI-H292 IL-13 


1.7 


Ovary 


5.5 


18.4 


HCC 5 


5.3 


93357 NCI-H292_IFN 
gamma 


4.6 


(^\/arian 

V.^VCll ICll 1 

ca.OVCAR-3 


11.8 


21.2 


HCC 5 
Margin 


15.9 


93777 HPAEC - 


0.0 


Ovarian 

ra 0\/CAR-4 


4 6 


12.5 


Normal 
Bladder 


27.0 


93778_HPAECJL-1 
beta/TNA alpha 


0.0 


Ovarian 

ca OVCAR-S 


0 2 


0.0 


TCC 1 


2.1 


93254_Normal Human 
Lung Fibroblast none 


0.3 


Ovarian 
ca.OVCAR-8 


4.5 


21.5 


TCC2 


1.1 


93253_Normal Human 
Lung Fibroblast_TNFa (4 
ng/ml) and IL-lb (1 ng/ml) 


0.8 


Ovarian 
ca.lGROV-1 


4.3 


5.4 


TCC 3 


2.1 


93257_Normal Human 
Lung Fibroblast_IL-4 


0.5 


Ovarian ca.* 


50.0 


92.7 


TCC 3 


52.1 


93256 Nonmal Human 


0.3 



172 



BNSDOCID: <WO 0162928A2_L> 



wo 01/62928 PCT/USOl/06151 



(ascites) SK- 
OV-3 






Margin 




Lung Fibroblast_!L-9 




Litems 


7.6 


24.2 


Normal 
Ovary 


7.7 


93255_Normal Human 
Lung FibroblastJL-13 


1.7 


Plancenta 


17.0 


31.4 


OvCal 


89.5 


93258_Normal Human 
Lung Fibroblast J FN 
gamma 


10.2 


Prostate 


5.3 


15.5 


OvCa 2 


45.1 


93106_Dermal Fibroblasts 
CCD1070_resting 


0.0 


Prostate ca * 
(bone met)PC- 
3 


14,7 


42.6 


OvCa2 
Margin 


8.7 


93361 Dermal Fibroblasts 

CCD1070__TNFa!pha4 

ng/ml 


0.3 


Testis 


10,2 


13.1 


Normal 
Stomach 


25.7 


93105 Dermal Fibroblasts 
CCD1070JL-1 beta 1 
ng/ml 


0.0 


Melanoma 
Hs688(A)T 


0.0 


0.0 


Normal 
Stomach 


15.6 


93772_dermal 
fibroblast_IFN gamma 


0.8 


Melanoma* 
(met) 

Hs688(B)T 


0.1 


0.0 


GaCa 1 


26.6 


93771 dermal 
fibroblast IL-4 


as 


MelanomaUAC 
C-62 


0.2 


0.0 


GaCa 1 
Margin 


31.0 


93259 IBD Colitis 1** 


19.3 


Melanoma 
M14 


0.0 


0.0 


GaCa 2 


15.4 


93260 IBD Colitis 2 


6.1 


Melanoma 
LOX IMVI 


0.0 


0.0 


GaCa 2 
Margin 


5.2 


93261 IBDCrohns 


3.7 


Melanoma* 

(met)SK-MEL- 

5 


0.1 


0.0 


GaCa 3 


13.7 


735010 Colon normal 


26.1 


Adipose 


2.5 


29,9 






73501 9_Lung_none 


90.1 












64028- 1 __Thy mus_none 


100. 
0 












64030-1_Kidney_none 


16.0 



TABLE 40: NOV-3a, NOV-3b, NOV-3c Taqman Results

Panel 1 Tissue_Name | Panel 1 ag664 %Rel. Expn. | Panel 2 Tissue_Name | Panel 2 ag664 %Rel. Expn. | Panel 3 Tissue_Name | Panel 3 ag664 %Rel. Expn.
Liver adenocarcinoma | 13.6 | Normal Colon | 70.2 | 93768_Secondary Th1_anti-CD28/anti-CD3 | 16.4
Heart (fetal) | 6.5 | CCa 1 | 22.7 | 93769_Secondary Th2_anti-CD28/anti-CD3 | 12.9
Pancreas | 6.4 | CCa 1 Margin | 9.0 | 93770_Secondary Tr1_anti-CD28/anti-CD3 | 18.3
Pancreatic ca. CAPAN 2 | 1.6 | CCa 2 | 14.0 | 93573_Secondary Th1_resting day 4-6 in IL-2 | 22.1
Adrenal gland | 10.5 | CCa 2 Margin | 6.5 | 93572_Secondary Th2_resting day 4-6 in IL-2 | 13.1
Thyroid | 5.6 | CCa 3 | 42.6 | 93571_Secondary Tr1_resting day 4-6 in IL-2 | 23.0
Salivary gland | 4.8 | CCa 3 Margin | 20.2 | 93568_primary Th1_anti-CD28/anti-CD3 | 11.5
Pituitary gland | 14.3 | CCa 4 | 27.6 | 93569_primary Th2_anti-CD28/anti-CD3 | 15.9
Brain (fetal) | 27.6 | CCa 4 Margin | 10.2 | 93570_primary Tr1_anti-CD28/anti-CD3 | 16.5
Brain (whole) | 22.5 | CCa 5 Metastasis | 38.4 | 93565_primary Th1_resting dy 4-6 in IL-2 | 73.7
Brain (amygdala) | 22.7 | CCa 5 Margin (Liver) | 7.3 | 93566_primary Th2_resting dy 4-6 in IL-2 | 47.0
Brain (cerebellum) | 13.0 | CCa 6 | 34.4 | 93567_primary Tr1_resting dy 4-6 in IL-2 | 26.4
Brain (hippocampus) | 100.0 | CCa 6 Margin (Lung) | 5.9 | 93351_CD45RA CD4 lymphocyte_anti-CD28/anti-CD3 | 8.5
Brain (thalamus) | 22.4 | Normal Prostate | 20.7 | 93352_CD45RO CD4 lymphocyte_anti-CD28/anti-CD3 | 19.3
Cerebral Cortex | 24.3 | PCa 1 | 26.6 | 93251_CD8 Lymphocytes_anti-CD28/anti-CD3 | 8.0
Spinal cord | 22.7 | PCa 1 Margin | 32.8 | 93353_chronic CD8 Lymphocytes 2ry_resting dy 4-6 in IL-2 | 9.9
glio/astro U87-MG | 2.8 | PCa 2 | 47.3 | 93574_chronic CD8 Lymphocytes 2ry_activated CD3/CD28 | 6.7
glio/astro U-118-MG | 22.7 | PCa 2 Margin | 36.9 | 93354_CD4_none | 17.4
astro SW1783 | 5.4 | Normal Lung | 100.0 | 93252_Secondary Th1/Th2/Tr1_anti-CD95 CH11 | 20.7
neuro; met SK-N-AS | 26.8 | LCa 1 Metastasis | 12.5 | 93103_LAK cells_resting | 20.5
astro SF-539 | 12.8 | LCa 1 Margin (Muscle) | 3.8 | 93788_LAK cells_IL-2 | 19.3
astro SNB-75 | 5.4 | LCa 2 | 24.2 | 93787_LAK cells_IL-2+IL-12 | 6.8
glio SNB-19 | 7.4 | LCa 2 Margin | 40.9 | 93789_LAK cells_IL-2+IFN gamma | 16.0
glio U251 | 4.0 | LCa 3 | 13.6 | 93790_LAK cells_IL-2+IL-18 | 24.2
glio SF-295 | 4.5 | LCa 3 Margin | 7.8 | 93104_LAK cells_PMA/ionomycin and IL-18 | 1.5
Heart | 2.4 | LCa 4 | 10.4 | 93578_NK Cells IL-2_resting | 18.7
Skeletal muscle | 0.9 | LCa 5 | 32.3 | 93109_Mixed Lymphocyte Reaction_Two Way MLR | 23.7
Bone marrow | 17.0 | LCa 5 Margin | 12.1 | 93110_Mixed Lymphocyte Reaction_Two Way MLR | 5.8
Thymus | 20.3 | Ocular Melanoma Metastasis | 6.8 | 93111_Mixed Lymphocyte Reaction_Two Way MLR | 10.2
Spleen | 25.4 | Ocular Melanoma Margin (Liver) | 8.0 | 93112_Mononuclear Cells (PBMCs)_resting | 8.6
Lymph node | 29.9 | Melanoma Metastasis | 18.2 | 93113_Mononuclear Cells (PBMCs)_PWM | 24.5
Colorectal | 15.9 | Melanoma Margin (Lung) | 16.4 | 93114_Mononuclear Cells (PBMCs)_PHA-L | 18.6
Stomach | 24.8 | Normal Kidney | 40.9 | 93249_Ramos (B cell)_none | 5.2
Small intestine | 14.4 | RCC 1 | 32.8 | 93250_Ramos (B cell)_ionomycin | 17.8
Colon SW480 | 4.9 | RCC 1 Margin | 30.6 | 93349_B lymphocytes_PWM | 26.2
Colon SW620 (SW480 met) | 9.3 | RCC 2 | 63.3 | 93350_B lymphocytes_CD40L and IL-4 | 30.6
Colon HT29 | 6.6 | RCC 2 Margin | 9.7 | 92665_EOL-1 (Eosinophil)_dbcAMP differentiated | 9.7
Colon HCT-116 | 3.2 | RCC 3 | 31.2 | 93248_EOL-1 (Eosinophil)_dbcAMP/PMAionomycin | 22.2
Colon CaCo-2 | 3.7 | RCC 3 Margin | 18.6 | 93356_Dendritic Cells_none | 12.1
Colon Ca tissue (OD03866) | 18.7 | RCC 4 | 4.5 | 93355_Dendritic Cells_LPS 100 ng/ml | 20.0
Colon HCC-2998 | 32.5 | RCC 4 Margin | 12.2 | 93775_Dendritic Cells_anti-CD40 | 17.7
Gastric (liver met) NCI-N87 | 11.0 | RCC 5 | 11.6 | 93774_Monocytes_resting | 22.4
Bladder | 6.9 | RCC 5 Margin | 3.9 | 93776_Monocytes_LPS 50 ng/ml | 2.0
Trachea | 35.6 | RCC 6 | 15.8 | 93581_Macrophages_resting | 11.8
Kidney | 6.4 | RCC 6 Margin | 14.6 | 93582_Macrophages_LPS 100 ng/ml | 3.9
Kidney (fetal) | 13.7 | RCC 7 | 9.2 | 93098_HUVEC (Endothelial)_none | 6.0
Renal 786-0 | 0.0 | RCC 7 Margin | 5.8 | 93099_HUVEC (Endothelial)_starved | 15.0
Renal A498 | 20.7 | RCC 8 | 20.9 | 93100_HUVEC (Endothelial)_IL-1b | 5.3
Renal RXF 393 | 1.5 | RCC 8 Margin | 10.7 | 93779_HUVEC (Endothelial)_IFN gamma | 13.2
Renal ACHN | 1.8 | RCC 9 | 23.0 | 93102_HUVEC (Endothelial)_TNF alpha + IFN gamma | 9.0
Renal UO-31 | 1.6 | RCC 9 Margin | 21.0 | 93101_HUVEC (Endothelial)_TNF alpha + IL4 | 5.3
Renal TK-10 | 2.3 | Normal Uterus | 5.5 | 93781_HUVEC (Endothelial)_IL-11 | 5.5
Liver | 10.5 | UtCa 1 | 31.9 | 93583_Lung Microvascular Endothelial Cells_none | 7.3
Liver (fetal) | 21.8 | Normal Thyroid | 13.8 | 93584_Lung Microvascular Endothelial Cells_TNFa (4 ng/ml) and IL1b (1 ng/ml) | 6.5
Liver (hepatoblast) HepG2 | 14.1 | ThyCa 1 | 6.3 | 92662_Microvascular Dermal endothelium_none | 8.0
Lung | 48.6 | ThyCa 2 | 7.9 | 92663_Microvascular Dermal endothelium_TNFa (4 ng/ml) and IL1b (1 ng/ml) | 10.2
Lung (fetal) | 24.3 | ThyCa 2 Margin | 7.0 | 93773_Bronchial epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml) ** | 12.2
Lung (small cell) LX-1 | 4.7 | Normal Breast | 36.9 | 93347_Small Airway Epithelium_none | 4.9
Lung (small cell) NCI-H69 | 1.1 | BrCa 1 | 10.7 | 93348_Small Airway Epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml) | 32.5
Lung (s.cell var.) SHP-77 | 24.5 | BrCa 2 | 11.2 | 92668_Coronary Artery SMC_resting | 2.4
Lung (large cell) NCI-H460 | 1.6 | BrCa 3 Metastasis | 32.8 | 92669_Coronary Artery SMC_TNFa (4 ng/ml) and IL1b (1 ng/ml) | 0.0
Lung (non-sm. cell) A549 | 1.4 | BrCa 4 Metastasis | 13.7 | 93107_astrocytes_resting | 4.4
Lung (non-s.cell) NCI-H23 | 10.7 | BrCa 5 | 19.8 | 93108_astrocytes_TNFa (4 ng/ml) and IL1b (1 ng/ml) | 4.2
Lung (non-s.cell) HOP-62 | 32.3 | BrCa 6 | 29.1 | 92666_KU-812 (Basophil)_resting | 2.5
Lung (non-s.cl) NCI-H522 | 1.7 | BrCa 6 Margin | 17.2 | 92667_KU-812 (Basophil)_PMA/ionomycin | 7.2
Lung (squam.) SW 900 | 3.6 | BrCa 7 | 13.9 | 93579_CCD1106 (Keratinocytes)_none | 6.6
Lung (squam.) NCI-H596 | 0.9 | BrCa 7 Margin | 25.5 | 93580_CCD1106 (Keratinocytes)_TNFa and IFNg ** | 3.0
Mammary gland | 27.2 | Normal Liver | 8.3 | 93791_Liver Cirrhosis | 5.4
Breast (pl.ef) MCF-7 | 12.4 | HCC 1 | 14.1 | 93792_Lupus Kidney | 2.2
Breast (pl.ef) MDA-MB-231 | 18.7 | HCC 2 | 14.5 | 93577_NCI-H292 | 41.5
Breast (pl.ef) T47D | 0.4 | HCC 3 | 7.9 | 93358_NCI-H292_IL-4 | 62.4
Breast BT-549 | 28.9 | HCC 4 | 16.0 | 93360_NCI-H292_IL-9 | 53.2
Breast MDA-N | 20.6 | HCC 4 Margin | 19.3 | 93359_NCI-H292_IL-13 | 21.3
Ovary | 12.2 | HCC 5 | 4.0 | 93357_NCI-H292_IFN gamma | 19.9
Ovarian OVCAR-3 | 2.1 | HCC 5 Margin | 2.0 | 93777_HPAEC_- | 6.5
Ovarian OVCAR-4 | 0.5 | Normal Bladder | 44.4 | 93778_HPAEC_IL-1 beta/TNA alpha | 9.7
Ovarian OVCAR-5 | 0.6 | TCC 1 | 24.5 | 93254_Normal Human Lung Fibroblast_none | 2.2
Ovarian OVCAR-8 | 6.5 | TCC 2 | 16.4 | 93253_Normal Human Lung Fibroblast_TNFa (4 ng/ml) and IL-1b (1 ng/ml) | 3.0
Ovarian IGROV-1 | 3.5 | TCC 3 | 22.7 | 93257_Normal Human Lung Fibroblast_IL-4 | 4.0
Ovarian (ascites) SK-OV-3 | 2.3 | TCC 3 Margin | 13.4 | 93256_Normal Human Lung Fibroblast_IL-9 | 3.2
Uterus | 17.2 | Normal Ovary | 12.7 | 93255_Normal Human Lung Fibroblast_IL-13 | 4.8
Placenta | 12.9 | OvCa 1 | 23.3 | 93258_Normal Human Lung Fibroblast_IFN gamma | 4.0
Prostate | 8.5 | OvCa 2 | 72.2 | 93106_Dermal Fibroblasts CCD1070_resting | 1.5
Prostate (bone met) PC-3 | 3.3 | OvCa 2 Margin | 4.1 | 93361_Dermal Fibroblasts CCD1070_TNF alpha 4 ng/ml | 42.6
Testis | 4.1 | Normal Stomach | 20.2 | 93105_Dermal Fibroblasts CCD1070_IL-1 beta 1 ng/ml | 7.3
Melanoma Hs688(A).T | 0.6 | Normal Stomach | 5.2 | 93772_dermal fibroblast_IFN gamma | 4.3
Melanoma (met) Hs688(B).T | 0.5 | GaCa 1 | 8.4 | 93771_dermal fibroblast_IL-4 | 10.4
Melanoma UACC-62 | 1.6 | GaCa 1 Margin | 15.9 | 93259_IBD Colitis 1** | 3.2
Melanoma M14 | 0.6 | GaCa 2 | 38.4 | 93260_IBD Colitis 2 | 0.0
Melanoma LOX IMVI | 1.7 | GaCa 2 Margin | 4.5 | 93261_IBD Crohns | 0.0
Melanoma (met) SK-MEL-5 | 6.2 | GaCa 3 | 55.5 | 735010_Colon_normal | 23.0
Adipose | 6.0 | | | 735019_Lung_none | 6.4
 | | | | 64028-1_Thymus_none | 21.2
 | | | | 64030-1_Kidney_none | 100.0



TABLE 41: NOV-4a, NOV-4b, NOV-4c, NOV-4d, and NOV-4e Taqman results

Panel 1 Tissue_Name | Panel 1 ag538 % Rel. expn. | Panel 2 Tissue_Name | Panel 2 ag538 % Rel. expn.
Adipose | 12.6 | Normal Colon GENPAK 061003 | 9.7
Adrenal gland | 19.9 | 83219 CC Well to Mod Diff (OD03866) | 4.3
Bladder | 100.0 | 83220 CC NAT (OD03866) | 3.3
Bone marrow | 4.8 | 83221 CC Gr.2 rectosigmoid (OD03868) | 2.9
Endothelial cells | 0.0 | 83222 CC NAT (OD03868) | 2.1
Endothelial cells (treated) | 4.5 | 83235 CC Mod Diff (OD03920) | 8.0
Liver | 9.3 | 83236 CC NAT (OD03920) | 4.6
Liver (fetal) | 4.1 | 83237 CC Gr.2 ascend colon (OD03921) | 3.4
Spleen | 4.4 | 83238 CC NAT (OD03921) | 2.4
Thymus | 2.3 | 83241 CC from Partial Hepatectomy (OD04309) | 2.8
Thyroid | 14.0 | 83242 Liver NAT (OD04309) | 4.5
Trachea | 7.6 | 87472 Colon mets to lung (OD04451-01) | 7.0
Testis | 10.4 | 87473 Lung NAT (OD04451-02) | 17.2
Spinal cord | 8.7 | Normal Prostate Clontech A+ 6546-1 | 6.2
Salivary gland | 13.7 | 84140 Prostate Cancer (OD04410) | 13.0
Brain (amygdala) | 0.2 | 84141 Prostate NAT (OD04410) | 100.0
Brain (cerebellum) | 0.8 | 87073 Prostate Cancer (OD04720-01) | 20.2
Brain (hippocampus) | 1.2 | 87074 Prostate NAT (OD04720-02) | 6.0
Brain (substantia nigra) | 7.9 | Normal Lung GENPAK 061010 | 2.7
Brain (thalamus) | 1.2 | 83239 Lung Met to Muscle (OD04286) | 0.5
Cerebral Cortex | 1.0 | 83240 Muscle NAT (OD04286) | 9.8
Brain (whole) | 0.4 | 84136 Lung Malignant Cancer (OD03126) | 2.0
Brain (fetal) | 0.1 | 84137 Lung NAT (OD03126) | 3.1
CNS ca. (glio/astro) U-118-MG | 1.3 | 84871 Lung Cancer (OD04404) | 2.0
CNS ca. (astro) SF-539 | 0.4 | 84872 Lung NAT (OD04404) | 13.2
CNS ca. (astro) SNB-75 | 1.0 | 84875 Lung Cancer (OD04565) | 9.8
CNS ca. (astro) SW1783 | 4.7 | 85950 Lung Cancer (OD04237-01) | 4.2
CNS ca. (glio) U251 | 0.0 | 85970 Lung NAT (OD04237-02) | 13.3
CNS ca. (glio) SF-295 | 2.1 | 83255 Ocular Mel Met to Liver (OD04310) | 0.7
CNS ca. (glio) SNB-19 | 0.0 | 83256 Liver NAT (OD04310) | 8.7
CNS ca. (glio/astro) U87-MG | 0.0 | 84139 Melanoma Mets to Lung (OD04321) | 1.2
CNS ca.* (neuro; met) SK-N-AS | 0.1 | 84138 Lung NAT (OD04321) | 6.0
Small intestine | 31.4 | Normal Kidney GENPAK 061008 | 7.5
Colorectal | 29.7 | 83786 Kidney Ca Nuclear grade 2 (OD04338) | 8.8
Colon ca. HT29 | 0.2 | 83787 Kidney NAT (OD04338) | 16.5
Colon ca. CaCo-2 | 0.0 | 83788 Kidney Ca Nuclear grade 1/2 (OD04339) | 3.9
Colon ca. HCT-15 | 0.4 | 83789 Kidney NAT (OD04339) | 6.9
Colon ca. HCT-116 | 0.0 | 83790 Kidney Ca Clear cell type (OD04340) | 8.0
Colon ca. HCC-2998 | 0.8 | 83791 Kidney NAT (OD04340) | 8.8
Colon ca. SW480 | 0.3 | 83792 Kidney Ca. Nuclear grade 3 (OD04348) | 3.9
Colon ca.* (SW480 met) SW620 | 0.0 | 83793 Kidney NAT (OD04348) | 13.3
Fetal Skeletal | 16.5 | 87474 Kidney Cancer (OD04622-01) | 5.2
Skeletal muscle | 20.9 | 87475 Kidney NAT (OD04622-03) | 9.1
Heart | 33.9 | 85973 Kidney Cancer (OD04450-01) | 4.4
Stomach | 19.8 | 85974 Kidney NAT (OD04450-03) | 11.3
Gastric ca.* (liver met) NCI-N87 | 2.2 | Kidney Cancer Clontech 8120607 | 2.1
Kidney | 15.8 | Kidney NAT Clontech 8120608 | 5.0
Kidney (fetal) | 8.1 | Kidney Cancer Clontech 8120613 | 0.1
Renal ca. 786-0 | 3.0 | Kidney NAT Clontech 8120614 | 3.6
Renal ca. A498 | 3.9 | Kidney Cancer Clontech 9010320 | 6.5
Renal ca. ACHN | 97.3 | Kidney NAT Clontech 9010321 | 5.6
Renal ca. TK-10 | 0.4 | Normal Uterus GENPAK 061018 | 8.9
Renal ca. UO-31 | 10.4 | Uterus Cancer GENPAK 064011 | 6.1
Renal ca. RXF 393 | 6.4 | Normal Thyroid Clontech A+ 6570-1 | 2.3
Pancreas | 13.1 | Thyroid Cancer GENPAK 064010 | 1.0
Pancreatic ca. CAPAN 2 | 0.1 | Thyroid Cancer INVITROGEN A302152 | 10.2
Ovary | 23.8 | Thyroid NAT INVITROGEN A302153 | 6.5
Ovarian ca. IGROV-1 | 0.0 | Normal Breast GENPAK 061019 | 8.1
Ovarian ca. OVCAR-3 | 26.6 | 84877 Breast Cancer (OD04566) | 6.0
Ovarian ca. OVCAR-4 | 1.4 | 85975 Breast Cancer (OD04590-01) | 8.0
Ovarian ca. OVCAR-5 | 3.4 | 85976 Breast Cancer Mets (OD04590-03) | 7.2
Ovarian ca. OVCAR-8 | 0.0 | 87070 Breast Cancer Metastasis (OD04655-05) | 2.2
Ovarian ca.* (ascites) SK-OV-3 | 0.0 | GENPAK Breast Cancer 064006 | 19.2
Prostate | 56.3 | Breast Cancer Clontech 9100266 | 4.0
Prostate ca.* (bone met) PC-3 | 0.0 | Breast NAT Clontech 9100265 | 6.6
Placenta | 66.0 | Breast Cancer INVITROGEN A209073 | 4.7
Pituitary gland | 4.5 | Breast NAT INVITROGEN A2090734 | 9.0
Uterus | 22.4 | Normal Liver GENPAK 061009 | 4.6
 | | Liver Cancer GENPAK 064003 | 1.1
 | | Liver Cancer Research Genetics RNA 1025 | 4.5
 | | Liver Cancer Research Genetics RNA 1026 | 4.6
 | | Paired Liver Cancer Tissue Research Genetics RNA 6004-T | 3.9
 | | Paired Liver Tissue Research Genetics RNA 6004-N | 3.6
 | | Paired Liver Cancer Tissue Research Genetics RNA 6005-T | 5.4
 | | Paired Liver Tissue Research Genetics RNA 6005-N | 5.1
 | | Normal Bladder GENPAK 061001 | 10.4
 | | Bladder Cancer Research Genetics RNA 1023 | 5.7
 | | Bladder Cancer INVITROGEN A302173 | 2.5
 | | 87071 Bladder Cancer (OD04718-01) | 4.9
 | | 87072 Bladder Normal Adjacent (OD04718-03) | 11.4
 | | Normal Ovary Res. Gen. | 3.8
 | | Ovarian Cancer GENPAK 064008 | 19.1
 | | 87492 Ovary Cancer (OD04768-07) | 2.1
 | | 87493 Ovary NAT (OD04768-08) | 23.8
 | | Normal Stomach GENPAK 061017 | 12.3
 | | NAT Stomach Clontech 9060359 | 12.2
 | | Gastric Cancer Clontech 9060395 | 8.1
 | | NAT Stomach Clontech 9060394 | 18.3
 | | Gastric Cancer Clontech 9060397 | 7.7
 | | NAT Stomach Clontech 9060396 | 8.5
 | | Gastric Cancer GENPAK 064005 | 15.4

The Taqman results are summarized in Table 42.



TABLE 42 



NOVX | Internal Accession Number | Results
NOV-1 | 10132038.0.67 | Normal adjacent tissue to colon cancer tissue showed a higher expression of the gene as compared to colon cancer tissue itself. The results also demonstrate a similar profile for lung and ocular melanoma.
NOV-2a | 10l3203o.0.1iy |
NOV-2b | lUl3ZUJo.U.l JO |
NOV-3a | 18552586_EXT1 | High level of expression in brain and moderate expression in lung and trachea, suggesting its potential role in diseases involving these tissues. Increased expression in normal colon as compared to colon cancer tissue. Cancerous uterus and ovary tissues exhibited significantly higher expression than their normal counterparts.
NOV-3b | 18552586_EXT2 |
NOV-3c | 18552586_EXT3 |
NOV-3d | 18552586_EXT4 |
NOV-4a | 10093872.0.107 | Increased expression in normal bladder and moderate expression in prostate, heart, placenta, small intestine, and colorectal cells. Normal adjacent tissue (NAT) of prostate showed maximum expression.
NOV-4b | 10093872.1 |
NOV-4c | 10093872.0.38 |
NOV-4d | 10093872.2 |
NOV-4e | 10093872.3 |



OTHER EMBODIMENTS

While the invention has been described in conjunction with the detailed description 
thereof, the foregoing description is intended to illustrate and not limit the scope of the 
invention, which is defined by the scope of the appended claims. Other aspects, advantages, 
and modifications are within the scope of the following claims.

What is claimed is:

1. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of: 

a) a mature form of the amino acid sequence selected from the group consisting of 
SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23; 

b) a variant of a mature form of the amino acid sequence selected from the group 
consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, wherein
any amino acid in the mature form is changed to a different amino acid, 
provided that no more than 15% of the amino acid residues in the sequence of 
the mature form are so changed; 

c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2, 
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23; 

d) a variant of the amino acid sequence selected from the group consisting of SEQ 
ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 wherein any amino acid
specified in the chosen sequence is changed to a different amino acid, provided
that no more than 15% of the amino acid residues in the sequence are so
changed; and 

e) a fragment of any of a) through d). 

2. The polypeptide of claim 1 that is a naturally occurring allelic variant of the sequence 
selected from the group consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21,
or 23. 

3. The polypeptide of claim 2, wherein the variant is the translation of a single nucleotide 
polymorphism. 

4. The polypeptide of claim 1 that is a variant polypeptide described therein, wherein any 
amino acid specified in the chosen sequence is changed to provide a conservative 
substitution. 

5. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a 
polypeptide comprising an amino acid sequence selected from the group consisting of: 
a) a mature form of the amino acid sequence given SEQ ID NO: 2, 4, 5, 7, 9, 11,
13, 15, 17, 19, 21, or 23;

b) a variant of a mature form of the amino acid sequence selected from the group
consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 wherein
any amino acid in the mature form of the chosen sequence is changed to a
different amino acid, provided that no more than 15% of the amino acid
residues in the sequence of the mature form are so changed; 

c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2, 
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23; 

d) a variant of the amino acid sequence selected from the group consisting of SEQ 
ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23, in which any amino acid
specified in the chosen sequence is changed to a different amino acid, provided
that no more than 15% of the amino acid residues in the sequence are so
changed; 

e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising 
the amino acid sequence selected from the group consisting of SEQ ID NO: 2,
4, 5, 7, 9, 11, 13, 15, 17, 19, 21, or 23 or any variant of said polypeptide
wherein any amino acid of the chosen sequence is changed to a different amino 
acid, provided that no more than 10% of the amino acid residues in the 
sequence are so changed; and 

f) the complement of any of said nucleic acid molecules. 

6. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises the 
nucleotide sequence of a naturally occurring allelic nucleic acid variant. 

7. The nucleic acid molecule of claim 5 that encodes a variant polypeptide, wherein the 
variant polypeptide has the polypeptide sequence of a naturally occurring polypeptide 
variant. 

8. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises a 
single nucleotide polymorphism encoding said variant polypeptide. 

9. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule comprises a 
nucleotide sequence selected from the group consisting of 

a) the nucleotide sequence selected from the group consisting of SEQ ID NO: 1,
3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57;

b) a nucleotide sequence wherein one or more nucleotides in the nucleotide
sequence selected from the group consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12, 
14, 16, 18, 20, 22, or 57 is changed from that selected from the group 
consisting of the chosen sequence to a different nucleotide provided that no
more than 15% of the nucleotides are so changed; 

c) a nucleic acid fragment of the sequence selected from the group consisting of 
SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57; and

d) a nucleic acid fragment wherein one or more nucleotides in the nucleotide 
sequence selected from the group consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12, 
14, 16, 18, 20, 22, or 57 is changed from that selected from the group
consisting of the chosen sequence to a different nucleotide provided that no
more than 15% of the nucleotides are so changed.

10. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule hybridizes 
under stringent conditions to the nucleotide sequence selected from the group
consisting of SEQ ID NO: 1, 3, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 57, or a complement 
of said nucleotide sequence. 

11. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises a 
nucleotide sequence in which any nucleotide specified in the coding sequence of the 
chosen nucleotide sequence is changed from that selected from the group consisting of 
the chosen sequence to a different nucleotide provided that no more than 15% of the 
nucleotides in the chosen coding sequence are so changed, an isolated second 
polynucleotide that is a complement of the first polynucleotide, or a fragment of any of 
them. 

12. A vector comprising the nucleic acid molecule of claim 11.

13. The vector of claim 12, further comprising a promoter operably linked to said nucleic
acid molecule. 

14. A cell comprising the vector of claim 12. 



15. An antibody that binds immunospecifically to the polypeptide of claim 1.

16. The antibody of claim 15, wherein said antibody is a monoclonal antibody.

17. The antibody of claim 15, wherein the antibody is a humanized antibody.

18. A method for determining the presence or amount of the polypeptide of claim 1 in a 
sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to an antibody that binds immunospecifically to the 
polypeptide; and 

(c) determining the presence or amount of antibody bound to said polypeptide, 
thereby determining the presence or amount of polypeptide in said sample.

19. A method for determining the presence or amount of the nucleic acid molecule of 
claim 5 in a sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to a probe that binds to said nucleic acid molecule; and 

(c) determining the presence or amount of said probe bound to said nucleic acid
molecule, thereby determining the presence or amount of the nucleic acid
molecule in said sample. 

20. A method of identifying an agent that binds to the polypeptide of claim 1, the method 
comprising: 

(a) introducing said polypeptide to said agent; and 

(b) determining whether said agent binds to said polypeptide. 

21. A method for identifying a potential therapeutic agent for use in treatment of a
pathology, wherein the pathology is related to aberrant expression or aberrant 
physiological interactions of the polypeptide of claim 1, the method comprising: 

(a) providing a cell expressing the polypeptide of claim 1 and having a property or 
function ascribable to the polypeptide; 

(b) contacting the cell with a composition comprising a candidate substance; and 

(c) determining whether the substance alters the property or function ascribable to 
the polypeptide;

whereby, if an alteration observed in the presence of the substance is not observed when the
cell is contacted with a composition devoid of the substance, the substance is identified
as a potential therapeutic agent.

22. A method for modulating the activity of the polypeptide of claim 1, the method 
comprising introducing a cell sample expressing the polypeptide of said claim with a 
compound that binds to said polypeptide in an amount sufficient to modulate the
activity of the polypeptide. 

23. A method of treating or preventing a pathology associated with the polypeptide of 
claim 1, said method comprising administering the polypeptide of claim 1 to a subject 
in which such treatment or prevention is desired in an amount sufficient to treat or 
prevent said pathology in said subject. 

24. The method of claim 23, wherein said subject is a human.

25. A method of treating or preventing a pathology associated with the polypeptide of 
claim 1, said method comprising administering to a subject in which such treatment or
prevention is desired a NOVX nucleic acid in an amount sufficient to treat or prevent 
said pathology in said subject. 

26. The method of claim 25, wherein said subject is a human. 

27. A method of treating or preventing a pathology associated with the polypeptide of 
claim 1, said method comprising administering to a subject in which such treatment or 
prevention is desired a NOVX antibody in an amount sufficient to treat or prevent said
pathology in said subject. 

28. The method of claim 27, wherein the subject is a human. 

29. A pharmaceutical composition comprising the polypeptide of claim 1 and a 
pharmaceutically acceptable carrier. 

30. A pharmaceutical composition comprising the nucleic acid molecule of claim 5 and a
pharmaceutically acceptable carrier.

31. A pharmaceutical composition comprising the antibody of claim 15 and a
pharmaceutically acceptable carrier. 

32. A kit comprising in one or more containers, the pharmaceutical composition of claim 
29. 

33. A kit comprising in one or more containers, the pharmaceutical composition of claim 
30. 

34. A kit comprising in one or more containers, the pharmaceutical composition of claim 
31. 

35. The use of a therapeutic in the manufacture of a medicament for treating a syndrome 
associated with a human disease, the disease selected from a pathology associated with 
the polypeptide of claim 1, wherein said therapeutic is the polypeptide of claim 1. 

36. The use of a therapeutic in the manufacture of a medicament for treating a syndrome 
associated with a human disease, the disease selected from a pathology associated with 
the polypeptide of claim 1, wherein said therapeutic is a NOVX nucleic acid. 

37. The use of a therapeutic in the manufacture of a medicament for treating a syndrome 
associated with a human disease, the disease selected from a pathology associated with 
the polypeptide of claim 1, wherein said therapeutic is a NOVX antibody. 

38. A method for screening for a modulator of activity or of latency or predisposition to a 
pathology associated with the polypeptide of claim 1, said method comprising: 

a) administering a test compound to a test animal at increased risk for a pathology 
associated with the polypeptide of claim 1, wherein said test animal 
recombinantly expresses the polypeptide of claim 1; 

b) measuring the activity of said polypeptide in said test animal after 
administering the compound of step (a); and 

c) comparing the activity of said protein in said test animal with the activity of 
said polypeptide in a control animal not administered said polypeptide, wherein 
a change in the activity of said polypeptide in said test animal relative to said 
control animal indicates the test compound is a modulator of latency of, or 
predisposition to, a pathology associated with the polypeptide of claim 1. 

39. The method of claim 38, wherein said test animal is a recombinant test animal that 
expresses a test protein transgene or expresses said transgene under the control of a 
promoter at an increased level relative to a wild-type test animal, and wherein said 
promoter is not the native gene promoter of said transgene. 

40. A method for determining the presence of or predisposition to a disease associated with 
altered levels of the polypeptide of claim 1 in a first mammalian subject, the method 
comprising: 

a) measuring the level of expression of the polypeptide in a sample from the first 
mammalian subject; and 

b) comparing the amount of said polypeptide in the sample of step (a) to the 
amount of the polypeptide present in a control sample from a second 
mammalian subject known not to have, or not to be predisposed to, said 
disease, wherein an alteration in the expression level of the polypeptide in the 
first subject as compared to the control sample indicates the presence of or 
predisposition to said disease. 

41. A method for determining the presence of or predisposition to a disease associated with 
altered levels of the nucleic acid molecule of claim 5 in a first mammalian subject, the 
method comprising: 

a) measuring the amount of the nucleic acid in a sample from the first mammalian 
subject; and 

b) comparing the amount of said nucleic acid in the sample of step (a) to the 
amount of the nucleic acid present in a control sample from a second 
mammalian subject known not to have, or not to be predisposed to, the disease; 
wherein an alteration in the level of the nucleic acid in the first subject as 
compared to the control sample indicates the presence of or predisposition to 
the disease. 

42. A method of treating a pathological state in a mammal, the method comprising 
administering to the mammal a polypeptide in an amount that is sufficient to alleviate 
the pathological state, wherein the polypeptide is a polypeptide having an amino acid 
sequence at least 95% identical to a polypeptide comprising the amino acid sequence 
selected from the group consisting of SEQ ID NO: 2, 4, 5, 7, 9, 11, 13, 15, 17, 19, 21, 
or 23, or a biologically active fragment thereof. 

43. A method of treating a pathological state in a mammal, the method comprising 
administering to the mammal the antibody of claim 15 in an amount sufficient to 
alleviate the pathological state. 